user/sven/postgresql.git

Age	Commit message (Collapse)	Author
2012-07-12	Fix walsender processes to establish a SIGALRM handler.	Tom Lane
	Walsenders must have working SIGALRM handling during InitPostgres, but they set the handler to SIG_IGN so that nothing would happen if a timeout was reached. This could result in two failure modes: * If a walsender participated in a deadlock during its authentication transaction, and was the last to wait in the deadly embrace, the deadlock would not get cleared automatically. This would require somebody to be trying to take out AccessExclusiveLock on multiple system catalogs, so it's not very probable. * If a client failed to respond to a walsender's authentication challenge, the intended disconnect after AuthenticationTimeout wouldn't happen, and the walsender would wait indefinitely for the client. For the moment, fix in back branches only, since this is fixed in a different way in the timeout-infrastructure patch that's awaiting application to HEAD. If we choose not to apply that, then we'll need to do this in HEAD as well.
2012-07-10	Back-patch fix for extraction of fixed prefixes from regular expressions.	Tom Lane
	Back-patch of commits 628cbb50ba80c83917b07a7609ddec12cda172d0 and c6aae3042be5249e672b731ebeb21875b5343010. This has been broken since 7.3, so back-patch to all supported branches.
2012-07-10	Back-patch addition of pg_wchar-to-multibyte conversion functionality.	Tom Lane
	Back-patch of commits 72dd6291f216440f6bb61a8733729a37c7e3b2d2, f6a05fd973a102f7e66c491d3f854864b8d24844, and 60e9c224a197aa37abb1aa3aefa3aad42da61f7f. This is needed to support fixing the regex prefix extraction bug in back branches.
2012-07-09	Refactor pattern_fixed_prefix() to avoid dealing in incomplete patterns.	Tom Lane
	Previously, pattern_fixed_prefix() was defined to return whatever fixed prefix it could extract from the pattern, plus the "rest" of the pattern. That definition was sensible for LIKE patterns, but not so much for regexes, where reconstituting a valid pattern minus the prefix could be quite tricky (certainly the existing code wasn't doing that correctly). Since the only thing that callers ever did with the "rest" of the pattern was to pass it to like_selectivity() or regex_selectivity(), let's cut out the middle-man and just have pattern_fixed_prefix's subroutines do this directly. Then pattern_fixed_prefix can return a simple selectivity number, and the question of how to cope with partial patterns is removed from its API specification. While at it, adjust the API spec so that callers who don't actually care about the pattern's selectivity (which is a lot of them) can pass NULL for the selectivity pointer to skip doing the work of computing a selectivity estimate. This patch is only an API refactoring that doesn't actually change any processing, other than allowing a little bit of useless work to be skipped. However, it's necessary infrastructure for my upcoming fix to regex prefix extraction, because after that change there won't be any simple way to identify the "rest" of the regex, not even to the low level of fidelity needed by regex_selectivity. We can cope with that if regex_fixed_prefix and regex_selectivity communicate directly, but not if we have to work within the old API. Hence, back-patch to all active branches.
2012-07-05	Don't try to trim "../" in join_path_components().	Tom Lane
	join_path_components() tried to remove leading ".." components from its tail argument, but it was not nearly bright enough to do so correctly unless the head argument was (a) absolute and (b) canonicalized. Rather than try to fix that logic, let's just get rid of it: there is no correctness reason to remove "..", and cosmetic concerns can be taken care of by a subsequent canonicalize_path() call. Per bug #6715 from Greg Davidson. Back-patch to all supported branches. It appears that pre-9.2, this function is only used with absolute paths as head arguments, which is why we'd not noticed the breakage before. However, third-party code might be expecting this function to work in more general cases, so it seems wise to back-patch. In HEAD and 9.2, also make some minor cosmetic improvements to callers.
2012-07-05	Revert part of the previous patch that avoided using PLy_elog().	Heikki Linnakangas
	That caused the plpython_unicode regression test to fail on SQL_ASCII encoding, as evidenced by the buildfarm. The reason is that with the patch, you don't get the detail in the error message that you got before. That detail is actually very informative, so rather than just adjust the expected output, let's revert that part of the patch for now to make the buildfarm green again, and figure out some other way to avoid the recursion of PLy_elog() that doesn't lose the detail.
2012-07-05	Fix mapping of PostgreSQL encodings to Python encodings.	Heikki Linnakangas
	Windows encodings, "win1252" and so forth, are named differently in Python, like "cp1252". Also, if the PyUnicode_AsEncodedString() function call fails for some reason, use a plain ereport(), not a PLy_elog(), to report that error. That avoids recursion and crash, if PLy_elog() tries to call PLyUnicode_Bytes() again. This fixes bug reported by Asif Naeem. Backpatch down to 9.0, before that plpython didn't even try these conversions. Jan Urbański, with minor comment improvements by me.
2012-06-30	Prevent CREATE TABLE LIKE/INHERITS from (mis) copying whole-row Vars.	Tom Lane
	If a CHECK constraint or index definition contained a whole-row Var (that is, "table.*"), an attempt to copy that definition via CREATE TABLE LIKE or table inheritance produced incorrect results: the copied Var still claimed to have the rowtype of the source table, rather than the created table. For the LIKE case, it seems reasonable to just throw error for this situation, since the point of LIKE is that the new table is not permanently coupled to the old, so there's no reason to assume its rowtype will stay compatible. In the inheritance case, we should ideally allow such constraints, but doing so will require nontrivial refactoring of CREATE TABLE processing (because we'd need to know the OID of the new table's rowtype before we adjust inherited CHECK constraints). In view of the lack of previous complaints, that doesn't seem worth the risk in a back-patched bug fix, so just make it throw error for the inheritance case as well. Along the way, replace change_varattnos_of_a_node() with a more robust function map_variable_attnos(), which is capable of being extended to handle insertion of ConvertRowtypeExpr whenever we get around to fixing the inheritance case nicely, and in the meantime it returns a failure indication to the caller so that a helpful message with some context can be thrown. Also, this code will do the right thing with subselects (if we ever allow them in CHECK or indexes), and it range-checks varattnos before using them to index into the map array. Per report from Sergey Konoplev. Back-patch to all supported branches.
2012-06-29	Initialize shared memory copy of ckptXidEpoch correctly when not in recovery.	Heikki Linnakangas
	This bug was introduced by commit 20d98ab6e4110087d1816cd105a40fcc8ce0a307, so backpatch this to 9.0-9.2 like that one. This fixes bug #6710, reported by Tarvi Pillessaar
2012-06-29	Fix NOTIFY to cope with I/O problems, such as out-of-disk-space.	Tom Lane
	The LISTEN/NOTIFY subsystem got confused if SimpleLruZeroPage failed, which would typically happen as a result of a write() failure while attempting to dump a dirty pg_notify page out of memory. Subsequently, all attempts to send more NOTIFY messages would fail with messages like "Could not read from file "pg_notify/nnnn" at offset nnnnn: Success". Only restarting the server would clear this condition. Per reports from Kevin Grittner and Christoph Berg. Back-patch to 9.0, where the problem was introduced during the LISTEN/NOTIFY rewrite.
2012-06-26	Backport fsync queue compaction logic to all supported branches.	Robert Haas
	This backports commit 7f242d880b5b5d9642675517466d31373961cf98, except for the counter in pg_stat_bgwriter. The underlying problem (namely, that a full fsync request queue causes terrible checkpoint behavior) continues to be reported in the wild, and this code seems to be safe and robust enough to risk back-porting the fix.
2012-06-21	Fix memory leak in ARRAY(SELECT ...) subqueries.	Tom Lane
	Repeated execution of an uncorrelated ARRAY_SUBLINK sub-select (which I think can only happen if the sub-select is embedded in a larger, correlated subquery) would leak memory for the duration of the query, due to not reclaiming the array generated in the previous execution. Per bug #6698 from Armando Miraglia. Diagnosis and fix idea by Heikki, patch itself by me. This has been like this all along, so back-patch to all supported versions.
2012-06-19	pg_dump: Fix verbosity level in LO progress messages	Alvaro Herrera
	In passing, reword another instance of the same message that was gratuitously different. Author: Josh Kupershmidt after a bug report by Bosco Rama
2012-06-19	Update copyright year in forgotten places	Peter Eisentraut
	found by Stefan Kaltenbrunner
2012-06-08	Fix bug in early startup of Hot Standby with subtransactions.	Simon Riggs
	When HS startup is deferred because of overflowed subtransactions, ensure that we re-initialize KnownAssignedXids for when both existing and incoming snapshots have non-zero qualifying xids. Fixes bug #6661 reported by Valentine Gogichashvili. Analysis and fix by Andres Freund
2012-06-07	Revert "Wake WALSender to reduce data loss at failover for async commit."	Tom Lane
	This reverts commit 090e8a984cf1a8a3ef7f6db6dc919f843902d80c. Since WalSndWakeup does not exist in 9.0, it's clear that this patch wasn't even compiled in this branch. Perhaps some variant of it is appropriate in 9.0, but for the moment I'm just going to un-break the buildfarm.
2012-06-07	Wake WALSender to reduce data loss at failover for async commit.	Simon Riggs
	WALSender now woken up after each background flush by WALwriter, avoiding multi-second replication delay for an all-async commit workload. Replication delay reduced from 7s with default settings to 200ms, allowing significantly reduced data loss at failover. Andres Freund and Simon Riggs
2012-06-01	Avoid early reuse of btree pages, causing incorrect query results.	Simon Riggs
	When we allowed read-only transactions to skip assigning XIDs we introduced the possibility that a fully deleted btree page could be reused. This broke the index link sequence which could then lead to indexscans silently returning fewer rows than would have been correct. The actual incidence of silent errors from this is thought to be very low because of the exact workload required and locking pre-conditions. Fix is to remove pages only if index page opaque->btpo.xact precedes RecentGlobalXmin. Noah Misch, reviewed by Simon Riggs
2012-05-31	Stamp 9.0.8.REL9_0_8	Tom Lane

2012-05-31	Translation updates	Peter Eisentraut

2012-05-31	Revert back-branch changes in behavior of age(xid).	Tom Lane
	Per discussion, it does not seem like a good idea to change the behavior of age(xid) in a minor release, even though the old definition causes the function to fail on hot standby slaves. Therefore, revert commit 5829387381d2e4edf84652bb5a712f6185860670 and follow-on commits in the back branches only.
2012-05-31	Update time zone data files to tzdata release 2012c.	Tom Lane
	DST law changes in Antarctica, Armenia, Chile, Cuba, Falkland Islands, Gaza, Haiti, Hebron, Morocco, Syria, Tokelau Islands. Historical corrections for Canada.
2012-05-30	Ignore SECURITY DEFINER and SET attributes for a PL's call handler.	Tom Lane
	It's not very sensible to set such attributes on a handler function; but if one were to do so, fmgr.c went into infinite recursion because it would call fmgr_security_definer instead of the handler function proper. There is no way for fmgr_security_definer to know that it ought to call the handler and not the original function referenced by the FmgrInfo's fn_oid, so it tries to do the latter, causing the whole process to start over again. Ordinarily such misconfiguration of a procedural language's handler could be written off as superuser error. However, because we allow non-superuser database owners to create procedural languages and the handler for such a language becomes owned by the database owner, it is possible for a database owner to crash the backend, which ideally shouldn't be possible without superuser privileges. In 9.2 and up we will adjust things so that the handler functions are always owned by superusers, but in existing branches this is a minor security fix. Problem noted by Noah Misch (after several of us had failed to detect it :-(). This is CVE-2012-2655.
2012-05-30	Expand the allowed range of timezone offsets to +/-15:59:59 from Greenwich.	Tom Lane
	We used to only allow offsets less than +/-13 hours, then it was +/14, then it was +/-15. That's still not good enough though, as per today's bug report from Patric Bechtel. This time I actually looked through the Olson timezone database to find the largest offsets used anywhere. The winners are Asia/Manila, at -15:56:00 until 1844, and America/Metlakatla, at +15:13:42 until 1867. So we'd better allow offsets less than +/-16 hours. Given the history, we are way overdue to have some greppable #define symbols controlling this, so make some ... and also remove an obsolete comment that didn't get fixed the last time. Back-patch to all supported branches.
2012-05-28	Teach AbortOutOfAnyTransaction to clean up partially-started transactions.	Tom Lane
	AbortOutOfAnyTransaction failed to do anything if the state it saw on entry corresponded to failing partway through StartTransaction. I fixed AbortCurrentTransaction to cope with that case way back in commit 60b2444cc3ba037630c9b940c3c9ef01b954b87b, but evidently overlooked that AbortOutOfAnyTransaction should do likewise. Back-patch to all supported branches. It's not clear that this omission has any more-than-cosmetic consequences, but it's also not clear that it doesn't, so back-patching seems the least risky choice.
2012-05-26	Prevent synchronized scanning when systable_beginscan chooses a heapscan.	Tom Lane
	The only interesting-for-performance case wherein we force heapscan here is when we're rebuilding the relcache init file, and the only such case that is likely to be examining a catalog big enough to be syncscanned is RelationBuildTupleDesc. But the early-exit optimization in that code gets broken if we start the scan at a random place within the catalog, so that allowing syncscan is actually a big deoptimization if pg_attribute is large (at least for the normal case where the rows for core system catalogs have never been changed since initdb). Hence, prevent syncscan here. Per my testing pursuant to complaints from Jeff Frost and Greg Sabino Mullane, though neither of them seem to have actually hit this specific problem. Back-patch to 8.3, where syncscan was introduced.
2012-05-25	Fix string truncation to be multibyte-aware in text_name and bpchar_name.	Tom Lane
	Previously, casts to name could generate invalidly-encoded results. Also, make these functions match namein() more exactly, by consistently using palloc0() instead of ad-hoc zeroing code. Back-patch to all supported branches. Karl Schnaitter and Tom Lane
2012-05-25	Use binary search instead of brute-force scan in findNamespace().	Tom Lane
	The previous coding presented a significant bottleneck when dumping databases containing many thousands of schemas, since the total time spent searching would increase roughly as O(N^2) in the number of objects. Noted by Jeff Janes, though I rewrote his proposed patch to use the existing findObjectByOid infrastructure. Since this is a longstanding performance bug, backpatch to all supported versions.
2012-05-22	Ensure that seqscans check for interrupts at least once per page.	Tom Lane
	If a seqscan encounters many consecutive pages containing only dead tuples, it can remain in the loop in heapgettup for a long time, and there was no CHECK_FOR_INTERRUPTS anywhere in that loop. This meant there were real-world situations where a query would be effectively uncancelable for long stretches. Add a check placed to occur once per page, which should be enough to provide reasonable response time without adding any measurable overhead. Report and patch by Merlin Moncure (though I tweaked it a bit). Back-patch to all supported branches.
2012-05-15	Fix bug in to_tsquery().	Heikki Linnakangas
	We were using memcpy() to copy to a possibly overlapping memory region, which is a no-no. Use memmove() instead.
2012-05-13	Fix DROP TABLESPACE to unlink symlink when directory is not there.	Tom Lane
	If the tablespace directory is missing entirely, we allow DROP TABLESPACE to go through, on the grounds that it should be possible to clean up the catalog entry in such a situation. However, we forgot that the pg_tblspc symlink might still be there. We should try to remove the symlink too (but not fail if it's no longer there), since not doing so can lead to weird behavior subsequently, as per report from Michael Nolan. There was some discussion of adding dependency links to prevent DROP TABLESPACE when the catalogs still contain references to the tablespace. That might be worth doing too, but it's an orthogonal question, and in any case wouldn't be back-patchable. Back-patch to 9.0, which is as far back as the logic looks like this. We could possibly do something similar in 8.x, but given the lack of reports I'm not sure it's worth the trouble, and anyway the case could not arise in the form the logic is meant to cover (namely, a post-DROP transaction rollback having resurrected the pg_tablespace entry after some or all of the filesystem infrastructure is gone).
2012-05-12	Ensure backwards compatibility for GetStableLatestTransactionId()	Simon Riggs

2012-05-11	Remove extraneous #include "storage/proc.h"	Simon Riggs

2012-05-11	Ensure age() returns a stable value rather than the latest value	Simon Riggs

2012-05-10	Fix Windows implementation of PGSemaphoreLock.	Tom Lane
	The original coding failed to reset ImmediateInterruptOK before returning, which would potentially allow a subsequent query-cancel interrupt to be accepted at an unsafe point. This is a really nasty bug since it's so hard to predict the consequences, but they could be unpleasant. Also, ensure that signal handlers are serviced before this function returns, even if the semaphore is already set. This should make the behavior more like Unix. Back-patch to all supported versions.
2012-05-09	PL/pgSQL RETURN NEXT was leaking converted tuples, causing	Joe Conway
	out of memory when looping through large numbers of rows. Flag the converted tuples to be freed. Complaint and patch by Joe.
2012-05-09	Avoid xid error from age() function when run on Hot Standby	Simon Riggs

2012-04-27	Fix printing of whole-row Vars at top level of a SELECT targetlist.	Tom Lane
	Normally whole-row Vars are printed as "tabname.". However, that does not work at top level of a targetlist, because per SQL standard the parser will think that the "" should result in column-by-column expansion; which is not at all what a whole-row Var implies. We used to just print the table name in such cases, which works most of the time; but it fails if the table name matches a column name available anywhere in the FROM clause. This could lead for instance to a view being interpreted differently after dump and reload. Adding parentheses doesn't fix it, but there is a reasonably simple kluge we can use instead: attach a no-op cast, so that the "*" isn't syntactically at top level anymore. This makes the printing of such whole-row Vars a lot more consistent with other Vars, and may indeed fix more cases than just the reported one; I'm suspicious that cases involving schema qualification probably didn't work properly before, either. Per bug report and fix proposal from Abbas Butt, though this patch is quite different in detail from his. Back-patch to all supported versions.
2012-04-27	Fix syslogger's rotation disable/re-enable logic.	Tom Lane
	If it fails to open a new log file, the syslogger assumes there's something wrong with its parameters (such as log_directory), and stops attempting automatic time-based or size-based log file rotations. Sending it SIGHUP is supposed to start that up again. However, the original coding for that was really bogus, involving clobbering a couple of GUC variables and hoping that SIGHUP processing would restore them. Get rid of that technique in favor of maintaining a separate flag showing we've turned rotation off. Per report from Mark Kirkwood. Also, the syslogger will automatically attempt to create the log_directory directory if it doesn't exist, but that was only happening at startup. For consistency and ease of use, it should do the same whenever the value of log_directory is changed by SIGHUP. Back-patch to all supported branches.
2012-04-25	Fix edge-case behavior of pg_next_dst_boundary().	Tom Lane
	Due to rather sloppy thinking (on my part, I'm afraid) about the appropriate behavior for boundary conditions, pg_next_dst_boundary() gave undefined, platform-dependent results when the input time is exactly the last recorded DST transition time for the specified time zone, as a result of fetching values one past the end of its data arrays. Change its specification to be that it always finds the next DST boundary after the input time, and adjust code to match that. The sole existing caller, DetermineTimeZoneOffset, doesn't actually care about this distinction, since it always uses a probe time earlier than the instant that it does care about. So it seemed best to me to change the API to make the result=1 and result=0 cases more consistent, specifically to ensure that the "before" outputs always describe the state at the given time, rather than hacking the code to obey the previous API comment exactly. Per bug #6605 from Sergey Burladyan. Back-patch to all supported versions.
2012-04-18	Revert recent commit re positional arguments.	Andrew Dunstan

2012-04-18	Fix copyfuncs/equalfuncs support for ReassignOwnedStmt.	Robert Haas
	Noah Misch
2012-04-17	Don't override arguments set via options with positional arguments.	Andrew Dunstan
	A number of utility programs were rather careless about paremeters that can be set via both an option argument and a positional argument. This leads to results which can violate the Principal Of Least Astonishment. These changes refuse to use positional arguments to override settings that have been made via positional arguments. The changes are backpatched to all live branches.
2012-04-11	Clamp indexscan filter condition cost estimate to be not less than zero.	Tom Lane
	cost_index tries to estimate the per-tuple costs of evaluating filter conditions (a/k/a qpquals) by subtracting the estimated cost of the indexqual conditions from that of the baserestrictinfo conditions. This is correct so long as the indexquals list is a subset of the baserestrictinfo list. However, in the presence of derived indexable conditions it's completely wrong, leading to bogus or even negative scan cost estimates, as seen for example in bug #6579 from Istvan Endredy. In practice the problem isn't severe except in the specific case of a LIKE optimization on a functional index containing a very expensive function. A proper fix for this might change cost estimates by more than people would like for stable branches, so in the back branches let's just clamp the cost difference to be not less than zero. That will at least prevent completely insane behavior, while not changing the results normally.
2012-04-09	Fix an Assert that turns out to be reachable after all.	Tom Lane
	estimate_num_groups() gets unhappy with create table empty(); select * from empty except select * from empty e2; I can't see any actual use-case for such a query (and the table is illegal per SQL spec), but it seems like a good idea that it not cause an assert failure.
2012-04-08	set_stack_base() no longer needs to be called in PostgresMain.	Heikki Linnakangas
	This was a thinko in previous commit. Now that stack base pointer is now set in PostmasterMain and SubPostmasterMain, it doesn't need to be set in PostgresMain anymore.
2012-04-08	Do stack-depth checking in all postmaster children.	Heikki Linnakangas
	We used to only initialize the stack base pointer when starting up a regular backend, not in other processes. In particular, autovacuum workers can run arbitrary user code, and without stack-depth checking, infinite recursion in e.g an index expression will bring down the whole cluster. The comment about PL/Java using set_stack_base() is not yet true. As the code stands, PL/java still modifies the stack_base_ptr variable directly. However, it's been discussed in the PL/Java mailing list that it should be changed to use the function, because PL/Java is currently oblivious to the register stack used on Itanium. There's another issues with PL/Java, namely that the stack base pointer it sets is not really the base of the stack, it could be something close to the bottom of the stack. That's a separate issue that might need some further changes to this code, but that's a different story. Backpatch to all supported releases.
2012-04-06	Fix misleading output from gin_desc().	Tom Lane
	XLOG_GIN_UPDATE_META_PAGE and XLOG_GIN_DELETE_LISTPAGE records were printed with a list link field labeled as "blkno", which was confusing, especially when the link was empty (InvalidBlockNumber). Print the metapage block number instead, since that's what's actually being updated. We could include the link values too as a separate field, but not clear it's worth the trouble. Back-patch to 8.4 where the dubious code was added.
2012-04-04	Fix syslogger to not lose log coherency under high load.	Tom Lane
	The original coding of the syslogger had an arbitrary limit of 20 large messages concurrently in progress, after which it would just punt and dump message fragments to the output file separately. Our ambitions are a bit higher than that now, so allow the data structure to expand as necessary. Reported and patched by Andrew Dunstan; some editing by Tom
2012-03-31	Fix O(N^2) behavior in pg_dump when many objects are in dependency loops.	Tom Lane
	Combining the loop workspace with the record of already-processed objects might have been a cute trick, but it behaves horridly if there are many dependency loops to repair: the time spent in the first step of findLoop() grows as O(N^2). Instead use a separate flag array indexed by dump ID, which we can check in constant time. The length of the workspace array is now never more than the actual length of a dependency chain, which should be reasonably short in all cases of practical interest. The code is noticeably easier to understand this way, too. Per gripe from Mike Roest. Since this is a longstanding performance bug, backpatch to all supported versions.