| Age | Commit message | Author |
 | 
ExecUpdate checked for whether ExecBRUpdateTriggers had returned a new
tuple value by seeing if the returned tuple was pointer-equal to the old
one.  But the "old one" was in estate->es_junkFilter's result slot, which
would be scribbled on if we had done an EvalPlanQual update in response to
a concurrent update of the target tuple; therefore we were comparing a
dangling pointer to a live one.  Given the right set of circumstances we
could get a false match, resulting in not forcing the tuple to be stored in
the slot we thought it was stored in.  In the case reported by Maxim Boguk
in bug #5798, this led to "cannot extract system attribute from virtual
tuple" failures when trying to do "RETURNING ctid".  I believe there is a
very-low-probability chance of more serious errors, such as generating
incorrect index entries based on the original rather than the
trigger-modified version of the row.
In HEAD, change all of ExecBRInsertTriggers, ExecIRInsertTriggers,
ExecBRUpdateTriggers, and ExecIRUpdateTriggers so that they continue to
have similar APIs.  In the back branches I just changed
ExecBRUpdateTriggers, since there is no bug in the ExecBRInsertTriggers
case.
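A minimal sketch of the affected pattern (table, function, and trigger names
are invented; actually hitting the bug also requires a concurrent update of
the same row from another session):
	CREATE TABLE t (id int PRIMARY KEY, val text);
	CREATE FUNCTION t_bump() RETURNS trigger LANGUAGE plpgsql AS
	$$ BEGIN NEW.val := NEW.val || '!'; RETURN NEW; END $$;
	CREATE TRIGGER t_before_upd BEFORE UPDATE ON t
	  FOR EACH ROW EXECUTE PROCEDURE t_bump();
	-- with a concurrent update of the same row in flight, this could
	-- fail with "cannot extract system attribute from virtual tuple"
	UPDATE t SET val = 'x' WHERE id = 1 RETURNING ctid;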
 | 
 | 
ts_typanalyze.c computes MCE statistics as fractions of the non-null rows,
which seems fairly reasonable, and anyway changing it in released versions
wouldn't be a good idea.  But then ts_selfuncs.c has to account for that.
Failure to do so results in overestimates in columns with a significant
fraction of null documents.  Back-patch to 8.4 where this stuff was
introduced.
Jesper Krogh
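A hedged illustration (table and column names invented): if, say, 90% of a
tsvector column is NULL, the MCE frequencies were previously applied to all
rows rather than just the non-null ones, so a query like
	EXPLAIN SELECT * FROM docs WHERE body_tsv @@ to_tsquery('common & word');
could show a row estimate roughly 10x too high.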
 | 
 | 
That function was supposing that indexoid == 0 for a hypothetical index,
but that is not likely to be true in any non-toy implementation of an index
adviser, since assigning a fake OID is the only way to know at EXPLAIN time
which hypothetical index got selected.  Fix by adding a flag to
IndexOptInfo to mark hypothetical indexes.  Back-patch to 9.0 where
get_actual_variable_range() was added.
Gurjeet Singh
 | 
 | 
Flattening of subquery range tables during setrefs.c could lead to the
rangetable indexes in PlanRowMark nodes not matching up with the column
names previously assigned to the corresponding resjunk ctid (resp. tableoid
or wholerow) columns.  Typical symptom would be either a "cannot extract
system attribute from virtual tuple" error or an Assert failure.  This
wasn't a problem before 9.0 because we didn't support FOR UPDATE below the
top query level, and so the final flattening could never renumber an RTE
that was relevant to FOR UPDATE.  Fix by using a plan-tree-wide unique
number for each PlanRowMark to label the associated resjunk columns, so
that the number need not change during flattening.
Per report from David Johnston (though I'm darned if I can see how this got
past initial testing of the relevant code).  Back-patch to 9.0.
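The kind of query affected looks roughly like this (names invented): a FOR
UPDATE inside a flattenable subquery, whose RTE can get renumbered by
setrefs.c after the resjunk ctid columns have already been named:
	SELECT ss.*, bar.note
	FROM (SELECT * FROM foo FOR UPDATE) ss
	JOIN bar ON bar.foo_id = ss.id;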
 | 
 | 
New versions of libintl redefine setlocale() as a macro,
which causes problems when the backend and libintl are
linked against different versions of the runtime, which
is often the case in MSVC builds.
Hiroshi Inoue, slightly updated comment by me
 | 
 | 
Previously reported as ERRCODE_ADMIN_SHUTDOWN, this case is now
reported as ERRCODE_DATABASE_DROPPED. No message text change.
Unlikely to happen on most servers, so this is a low-impact change that
allows session poolers to correctly handle this situation.
Tatsuo Ishii and Simon Riggs
 | 
 | 
 | 
 | 
Failure to do so can lead to constraint violations.  This was broken by
commit 1ddc2703a936d03953657f43345460b9242bbed1 on 2010-02-07, so
back-patch to 9.0.
Noah Misch.  Regression test by me.
 | 
 | 
In an inherited UPDATE/DELETE, each target table has its own subplan,
because it might have a column set different from other targets.  This
means that the resjunk columns we add to support EvalPlanQual might be
at different physical column numbers in each subplan.  The EvalPlanQual
rewrite I did for 9.0 failed to account for this, resulting in possible
misbehavior or even crashes during concurrent updates to the same row,
as seen in a recent report from Gordon Shannon.  Revise the data structure
so that we track resjunk column numbers separately for each subplan.
I also chose to move responsibility for identifying the physical column
numbers back to executor startup, instead of assuming that numbers derived
during preprocess_targetlist would stay valid throughout subsequent
massaging of the plan.  That's a bit slower, so we might want to consider
undoing it someday; but it would complicate the patch considerably and
didn't seem justifiable in a bug fix that has to be back-patched to 9.0.
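A sketch of the affected shape (names invented); the child's extra column
shifts the resjunk columns to different positions in its subplan:
	CREATE TABLE parent (id int, val text);
	CREATE TABLE child (extra int) INHERITS (parent);
	-- concurrent UPDATEs of the same row from two sessions exercised
	-- the broken EvalPlanQual path
	UPDATE parent SET val = 'x' WHERE id = 42;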
 | 
 | 
The "date" type supports a wider range of dates than int64 timestamps do.
However, there is pre-int64-timestamp code in the planner that assumes that
all date values can be converted to timestamp with impunity.  Fortunately,
what we really need out of the conversion is always a double (float8)
value; so even when the date is out of timestamp's range it's possible to
produce a sane answer.  All we need is a code path that doesn't try to
force the result into int64.  Per trouble report from David Rericha.
Back-patch to all supported versions.  Although this is surely a corner
case, there's not much point in advertising a date range wider than
timestamp's if we will choke on such values in unexpected places.
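For instance (table and column invented), estimating a comparison such as
	SELECT count(*) FROM events WHERE d > date '500000-01-01';
requires converting date values to float8 for histogram arithmetic; a bound
beyond timestamp's range (timestamp stops at 294276 AD, date extends to
5874897 AD) previously choked the int64 conversion path.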
 | 
 | 
We don't actually need optreset, because we can easily fix the code to
ensure that it's cleanly restartable after having completed a scan over the
argv array; which is the only case we need to restart in.  Getting rid of
it avoids a class of interactions with the system libraries and allows
reversion of my change of yesterday in postmaster.c and postgres.c.
Back-patch to 8.4.  Before that the getopt code was a bit different anyway.
 | 
 | 
 | 
 | 
Recent versions of the Linux system header files cause xlogdefs.h to
believe that open_datasync should be the default sync method, whereas
formerly fdatasync was the default on Linux.  open_datasync is a bad
choice, first because it doesn't actually outperform fdatasync (in fact
the reverse), and second because we try to use O_DIRECT with it, causing
failures on certain filesystems (e.g., ext4 with data=journal option).
This part of the patch is largely per a proposal from Marti Raudsepp.
More extensive changes are likely to follow in HEAD, but this is as much
change as we want to back-patch.
Also clean up confusing code and incorrect documentation surrounding the
fsync_writethrough option.  Those changes shouldn't result in any actual
behavioral change, but I chose to back-patch them anyway to keep the
branches looking similar in this area.
In 9.0 and HEAD, also do some copy-editing on the WAL Reliability
documentation section.
Back-patch to all supported branches, since any of them might get used
on modern Linux versions.
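After this change, an installation on Linux should report the traditional
default again; checking is a one-liner:
	SHOW wal_sync_method;  -- expected: fdatasync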
 | 
 | 
If there is an old transaction running in the master, and a lot of
transactions have started and finished since, and a WAL record is written in
the gap between creating the running-xacts snapshot and WAL-logging it,
recovery will fail with a "too many KnownAssignedXids" error. This bug was
reported by Joachim Wieland on Nov 19th.
In the same scenario, when fewer transactions have started so that all the
xids fit in KnownAssignedXids despite the first bug, a more serious bug
arises. We incorrectly initialize the clog code with the oldest still running
transaction, and when we see the WAL record belonging to a transaction with
an XID larger than one that committed already before the checkpoint we're
recovering from, we zero the clog page containing the already committed
transaction, leading to data loss.
In hindsight, trying to track xids in the known-assigned-xids array before
seeing the running-xacts record was too complicated. To fix that, hold
XidGenLock while the running-xacts snapshot is taken and WAL-logged. That
ensures that no transaction can begin or end in that gap, so that in recovery
we know that the snapshot contains all transactions running at that point in
WAL.
 | 
 | 
This removes an infrequently occurring race condition in Hot Standby.
An xid must be assigned before a lock appears in shared memory,
rather than immediately after, else GetRunningTransactionLocks()
may see InvalidTransactionId, causing assertion failures during
lock processing on standby.
Bug report and diagnosis by Fujii Masao, fix by me.
 | 
 | 
Temporary indexes are not WAL-logged. We used a constant LSN for temporary
indexes, on the assumption that we don't need to worry about concurrent page
splits in temporary indexes because they're only visible to the current
session. But that assumption is wrong: it's possible to insert rows and
split pages in the same session, while a scan is in progress. For example,
by opening a cursor and fetching some rows, and INSERTing new rows before
fetching some more.
Fix by generating fake increasing LSNs, used in place of real LSNs in
temporary GiST indexes.
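A sketch of the failing pattern (names invented), all in one session:
	BEGIN;
	CREATE TEMP TABLE boxes (b box);
	CREATE INDEX boxes_gist ON boxes USING gist (b);
	INSERT INTO boxes
	  SELECT box(point(x,x), point(x+1,x+1)) FROM generate_series(1,1000) x;
	DECLARE c CURSOR FOR
	  SELECT * FROM boxes WHERE b && box '((0,0),(2000,2000))';
	FETCH 10 FROM c;
	-- assuming the cursor scans the index, inserting more rows here
	-- can split pages the scan still has to visit
	INSERT INTO boxes
	  SELECT box(point(x,x), point(x+1,x+1)) FROM generate_series(1001,20000) x;
	FETCH ALL FROM c;
	COMMIT;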
 | 
 | 
by gcc version 4 on mingw and cygwin. We don't use dllexport here
because dllexport and dllwrap don't work well together.
 | 
 | 
as a variable or column name, and it's not reserved in recent versions of
the SQL spec either. This became particularly annoying in 9.0: before that,
PL/pgSQL replaced variable names in queries with parameter markers, so
it was possible to use OFF and many other backend parser keywords as
variable names. Because of that, backpatch to 9.0.
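Post-fix, a block like this (hypothetical) works again:
	DO $$
	DECLARE
	  off int := 0;
	BEGIN
	  RAISE NOTICE 'off is %', off;
	END $$;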
 | 
 | 
A couple of places in the planner need to generate whole-row Vars, and were
cutting corners by setting vartype = RECORDOID in the Vars, even in cases
where there's an identifiable named composite type for the RTE being
referenced.  While we mostly got away with this, it failed when there was
also a parser-generated whole-row reference to the same RTE, because the
two Vars weren't equal() due to the difference in vartype.  Fix by
providing a subroutine the planner can call to generate whole-row Vars
the same way the parser does.
Per bug #5716 from Andrew Tipton.  Back-patch to 9.0 where one of the bogus
calls was introduced (the other one is new in HEAD).
 | 
 | 
 | 
 | 
The point of a PlaceHolderVar is to allow a non-strict expression to be
evaluated below an outer join, after which its value bubbles up like a Var
and can be forced to NULL when the outer join's semantics require that.
However, there was a serious design oversight there, namely that we
didn't ensure that there was actually a correct place in the plan tree
to evaluate the placeholder :-(.  It may be necessary to delay evaluation
of an outer join to ensure that a placeholder that should be evaluated
below the join can be evaluated there.  Per recent bug report from Kirill
Simonov.
Back-patch to 8.4 where the PlaceHolderVar mechanism was introduced.
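The shape of query involved looks something like this (names invented): the
COALESCE is non-strict, so after the subquery is pulled up it becomes a
PlaceHolderVar that must be evaluated below the first left join:
	SELECT *
	FROM a
	LEFT JOIN (SELECT b.id, COALESCE(b.x, 0) AS ph FROM b) ss
	       ON a.id = ss.id
	LEFT JOIN c ON c.y = ss.ph;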
 | 
 | 
 | 
 | 
 | 
 | 
In these cases a qual can get marked with the removable rel in its
required_relids, but this is just to schedule its evaluation correctly, not
because it really depends on the rel.  We were assuming that, in effect,
we could throw away *all* quals so marked, which is nonsense.  Tighten up
the logic to be a little more paranoid about which quals belong to the
outer join being considered for removal, and arrange for all quals that
don't belong to be updated so they will still get evaluated correctly.
Also fix another problem that happened to be exposed by this test case,
which was that make_join_rel() was failing to notice some cases where
a constant-false qual could be used to prove a join relation empty.  If it's
a pushed-down constant false, then the relation is empty even if it's an
outer join, because the qual applies after the outer join expansion.
Per report from Nathan Grange.  Back-patch into 9.0.
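The second problem is easiest to see in a sketch (names invented): the
constant-false qual is pushed down from above the outer join, so it empties
the join result even though the join is an outer one:
	SELECT * FROM a LEFT JOIN b ON a.id = b.id WHERE false;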
 | 
 | 
separating prototypes for functions in walreceiver.c and walreceiverfuncs.c
with comments.
 | 
 | 
SI invalidation events, rather than indirectly through the relcache.
In the previous coding, we had to flush a composite-type typcache entry
whenever we discarded the corresponding relcache entry.  This caused problems
at least when testing with RELCACHE_FORCE_RELEASE, as shown in recent report
from Jeff Davis, and might result in real-world problems given the kind of
unexpected relcache flush that that test mechanism is intended to model.
The new coding decouples relcache and typcache management, which is a good
thing anyway from a structural perspective.  The cost is that we have to
search the typcache linearly to find entries that need to be flushed.  There
are a couple of ways we could avoid that, but at the moment it's not clear
it's worth any extra trouble, because the typcache contains very few entries
in typical operation.
Back-patch to 8.2, the same as some other recent fixes in this general area.
The patch could be carried back to 8.0 with some additional work, but given
that it's only hypothetical whether we're fixing any problem observable in
the field, it doesn't seem worth the work now.
 | 
 | 
 | 
 | 
There is no reason that proc.c should have to get involved in this dirty hack
for letting the postmaster know which children are walsenders.  Revert that
file to the way it was, and confine the kluge to pmsignal.c and postmaster.c.
 | 
 | 
The implicitly created sequence was created as owned by the current user,
who could be different from the table owner, eg if current user is a
superuser or some member of the table's owning role.  This caused sanity
checks in the SEQUENCE OWNED BY code to spit up.  Although possibly we
don't need those sanity checks, the safest fix seems to be to make sure
the implicit sequence is assigned the same owner role as the table has.
(We still do all permissions checks as the current user, however.)
Per report from Josh Berkus.
Back-patch to 9.0.  The bug goes back to the invention of SEQUENCE OWNED BY
in 8.2, but the fix requires an API change for DefineRelation(), which seems
to have potential for breaking third-party code if done in a minor release.
Given the lack of prior complaints, it's probably not worth fixing in the
stable branches.
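A hedged reconstruction of one way the owners can diverge (names invented):
	CREATE ROLE alice;
	CREATE TABLE items (note text);
	ALTER TABLE items OWNER TO alice;
	-- run as a superuser: pre-fix, the new sequence came out owned by
	-- the superuser rather than alice, and the SEQUENCE OWNED BY
	-- sanity check rejected the command
	ALTER TABLE items ADD COLUMN id serial;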
 | 
 | 
Fujii Masao
 | 
 | 
functionality, while creating an ambiguity in usage with ORDER BY that at
least two people have already gotten seriously confused by.  Also, add an
opr_sanity test to check that we don't in future violate the newly minted
policy of not having built-in aggregates with the same name and different
numbers of parameters.  Per discussion of a complaint from Thom Brown.
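The ambiguity in question, sketched with an invented table:
	-- was silently read as the one-argument aggregate ordered by
	-- (name, ','), not as supplying a delimiter; it now errors
	SELECT string_agg(name ORDER BY name, ',') FROM t;
	-- the unambiguous two-argument spelling
	SELECT string_agg(name, ',' ORDER BY name) FROM t;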
 | 
 | 
AIUI this was supposed to go into 9.0 as well as HEAD.
 | 
 | 
indexes when the index column type (the opclass opckeytype) is different from
the expression's datatype. When this was coded, the limitation wasn't worth
worrying about because we had no intelligence to speak of in stats collection
for the
datatypes used by such opclasses.  However, now that there's non-toy
estimation capability for tsvector queries, it amounts to a bug that ANALYZE
fails to do this.
The fix changes struct VacAttrStats, and therefore constitutes an API break
for custom typanalyze functions.  Therefore we can't back-patch it into
released branches, but it was agreed that 9.0 isn't yet frozen hard enough
to make such a change unacceptable.  Ergo, back-patch to 9.0 but no further.
The API break had better be mentioned in 9.0 release notes.
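The now-analyzable case looks like this (names invented); the tsvector GIN
opclass stores text keys (its opckeytype), which differs from the tsvector
type of the index expression:
	CREATE INDEX docs_fts ON docs USING gin (to_tsvector('english', body));
	ANALYZE docs;  -- now collects tsvector statistics for the expression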
 | 
 | 
Although the key-combining code claimed to work correctly if its input
contained both lossy and exact pointers for a single page in a single TID
stream, in fact this did not work, and could not work without pretty
fundamental redesign.  Modify keyGetItem so that it will not return such a
stream, by handling lossy-pointer cases a bit more explicitly than we did
before.
Per followup investigation of a gripe from Artur Dabrowski.
An example of a query that failed given his data set is
	select count(*) from search_tab where
	  (to_tsvector('german', keywords) @@ to_tsquery('german', 'ee:* | dd:*')) and
	  (to_tsvector('german', keywords) @@ to_tsquery('german', 'aa:*'));
Back-patch to 8.4 where the lossy pointer code was introduced.
 | 
 | 
struct representing a tree entry, rather than being a separately allocated
piece of storage.  This API is at least as clean as the old one (if not
more so --- there were some bizarre choices in there) and it permits a
very substantial memory savings, on the order of 2X in ginbulk.c's usage.
Also, fix minor memory leaks in code called by ginEntryInsert, in
particular in ginInsertValue and entryFillRoot, as well as ginEntryInsert
itself.  These leaks resulted in the GIN index build context continuing
to bloat even after we'd filled it to maintenance_work_mem and started
to dump data out to the index.
In combination these fixes restore the GIN index build code to honoring
the maintenance_work_mem limit about as well as it did in 8.4.  Speed
seems on par with 8.4 too, maybe even a bit faster, for a non-pathological
case in which HEAD was formerly slower.
Back-patch to 9.0 so we don't have a performance regression from 8.4.
 | 
 | 
routines to make them behave better in the presence of "lossy" index pointers.
The previous coding was outright incorrect for some cases, as recently
reported by Artur Dabrowski: scanGetItem would fail to return index entries in
cases where one index key had multiple exact pointers on the same page as
another key had a lossy pointer.  Also, keyGetItem was extremely inefficient
for cases where a single index key generates multiple "entry" streams, such as
an @@ operator with a multiple-clause tsquery.  The presence of a lossy page
pointer in any one stream defeated its ability to use the opclass
consistentFn, resulting in probing many heap pages that didn't really need to
be visited.  In Artur's example case, a query like
	WHERE tsvector @@ to_tsquery('a & b')
was about 50X slower than the theoretically equivalent
	WHERE tsvector @@ to_tsquery('a') AND tsvector @@ to_tsquery('b')
The way that I chose to fix this was to have GIN call the consistentFn
twice with both TRUE and FALSE values for the in-doubt entry stream,
returning a hit if either call produces TRUE, but not if they both return
FALSE.  The code handles this for the case of a single in-doubt entry stream,
but punts (falling back to the stupid behavior) if there's more than one lossy
reference to the same page.  The idea could be scaled up to deal with multiple
lossy references, but I think that would probably be wasted complexity.  At
least to judge by Artur's example, such cases don't occur often enough to be
worth trying to optimize.
Back-patch to 8.4.  8.3 did not have lossy GIN index pointers, so not
subject to these problems.
 | 
 | 
 | 
 | 
look through join alias Vars to avoid breaking join queries, and
move the test to someplace where it will catch more possible ways
of calling a function.  We still ought to throw away the whole thing
in favor of a data-type-based solution, but that's not feasible in
the back branches.
This needs to be back-patched further than 9.0, but I don't have time
to do so today.  Committing now so that the fix gets into 9.0beta4.
 | 
 | 
related functions.  Per today's discussion, we will henceforth assume
that datatype I/O functions are either stable or immutable, never volatile.
(This implies in particular that domain CHECK constraint expressions shouldn't
be volatile, since domain_in executes them.)  In turn, functions that execute
the I/O functions of arbitrary datatypes should always be labeled stable.
This affects the labeling of array_to_string, which was unsafely marked
immutable, and record_in, record_out, record_recv, record_send,
domain_in, domain_recv, which were over-conservatively marked volatile.
The array I/O functions were already marked stable, which is correct
per this policy but would have been wrong if we maintained domain_in
as volatile.
Back-patch to 9.0, along with an earlier fix to correctly mark cash_in
and cash_out as stable not immutable (since they depend on lc_monetary).
No catversion bump --- the implications of this are not currently
severe enough to justify a forced initdb.
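One user-visible consequence, as a hedged sketch (names invented):
array_to_string is stable rather than immutable under the new policy, so it
is no longer accepted directly in an index expression:
	CREATE INDEX ON t (array_to_string(tags, ','));  -- now rejected
	-- wrapping it in a function explicitly declared immutable is the
	-- usual workaround, when the data makes that safe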
 | 
 | 
a pass-by-reference datatype with a nontrivial projection step.
We were using the same memory context for the projection operation as for
the temporary context used by the hashtable routines in execGrouping.c.
However, the hashtable routines feel free to reset their temp context at
any time, which'd lead to destroying input data that was still needed.
Report and diagnosis by Tao Ma.
Back-patch to 8.1, where the problem was introduced by the changes that
allowed us to work with "virtual" tuples instead of materializing intermediate
tuple values everywhere.  The earlier code looks quite similar, but it doesn't
suffer the problem because the data gets copied into another context as a
result of having to materialize ExecProject's output tuple.
 | 
 | 
 | 
 | 
 | 
 | 
being used in a PL/pgSQL FOR loop is closed was inadequate, as Tom Lane
pointed out. The bug affects FOR statement variants too, because you can
close an implicitly created cursor by guessing the "<unnamed portal X>"
name created for it.
To fix that, "pin" the portal to prevent it from being dropped while it's
being used in a PL/pgSQL FOR loop. Backpatch all the way to 7.4, which is
the oldest supported version.
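The hole being closed, as a hypothetical sketch (the portal name is a guess
of the kind described above):
	DO $$
	DECLARE r record;
	BEGIN
	  FOR r IN SELECT generate_series(1, 10) LOOP
	    -- with the portal pinned, this now raises an error instead
	    -- of yanking the cursor out from under the loop
	    EXECUTE 'CLOSE "<unnamed portal 1>"';
	  END LOOP;
	END $$;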
 | 
 | 
max_standby_streaming_delay, and revise the implementation to avoid assuming
that timestamps found in WAL records can meaningfully be compared to clock
time on the standby server.  Instead, the delay limits are compared to the
elapsed time since we last obtained a new WAL segment from archive or since
we were last "caught up" to WAL data arriving via streaming replication.
This avoids problems with clock skew between primary and standby, as well
as other corner cases that the original coding would misbehave in, such
as the primary server having significant idle time between transactions.
Per my complaint some time ago and considerable ensuing discussion.
Do some desultory editing on the hot standby documentation, too.
 | 
 | 
Backpatch to 8.3, which is as far back as we have opfamilies.
The opclass portion could probably be backpatched to 8.2, when
REASSIGN OWNED was added, but for now I have not done that.
Asko Tiidumaa, with minor adjustments by me.
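The now-covered case, with invented role names; previously this failed when
the old role owned an operator class or operator family:
	REASSIGN OWNED BY old_role TO new_role;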
 | 
 | 
pg_pltemplate
This should have a catversion bump, but it's still being debated whether
it's worth it during beta.
 | 
 | 
Don't put the variable declaration in the middle of a bunch of externs,
and do use extern where it should be used.
 | 
 | 
master.  Otherwise a subsequent crash could cause the master to lose WAL that
has already been applied on the slave, resulting in the slave being out of
sync and soon corrupt.  Per recent discussion and an example from Robert Haas.
Fujii Masao
 | 
 | 
 | 
 | 
Move the description for vacuum_defer_cleanup_age to the correct category.
Sections in postgresql.conf are also sorted into the same order as the docs.
Per gripe by Fujii Masao, suggestion by Heikki Linnakangas, and patch by me.
 |