user/sven/postgresql.git

Age	Commit message (Collapse)	Author
2012-11-11	Check for stack overflow in transformSetOperationTree().	Tom Lane
	Since transformSetOperationTree() recurses, it can be driven to stack overflow with enough UNION/INTERSECT/EXCEPT clauses in a query. Add a check to ensure it fails cleanly instead of crashing. Per report from Matthew Gerber (though it's not clear whether this is the only thing going wrong for him). Historical note: I think the reasoning behind not putting a check here in the beginning was that the check in transformExpr() ought to be sufficient to guard the whole parser. However, because transformSetOperationTree() recurses all the way to the bottom of the set-operation tree before doing any analysis of the statement's expressions, that check doesn't save it.
2012-11-09	Remove leftover LWLockRelease() call	Alvaro Herrera
	This code was refactored in d5497b95 but an extra LWLockRelease call was left behind. Per report from Erik Rijkers
2012-11-08	XSLT stylesheet: Add slash to directory name	Peter Eisentraut
	Some versions of the XSLT stylesheets don't handle the missing slash correctly (they concatenate directory and file name without the slash). This might never have worked correctly.
2012-11-08	Fix WaitLatch() to return promptly when the requested timeout expires.	Tom Lane
	If the sleep is interrupted by a signal, we must recompute the remaining time to wait; otherwise, a steady stream of non-wait-terminating interrupts could delay return from WaitLatch indefinitely. This has been shown to be a problem for the autovacuum launcher, and there may well be other places now or in the future with similar issues. So we'd better make the function robust, even though this'll add at least one gettimeofday call per wait. Back-patch to 9.2. We might eventually need to fix 9.1 as well, but the code is quite different there, and the usage of WaitLatch in 9.1 is so limited that it's not clearly important to do so. Reported and diagnosed by Jeff Janes, though I rewrote his patch rather heavily.
2012-11-08	Rename ResolveNew() to ReplaceVarsFromTargetList(), and tweak its API.	Tom Lane
	This function currently lacks the option to throw error if the provided targetlist doesn't have any matching entry for a Var to be replaced. Two of the four existing call sites would be better off with an error, as would the usage in the pending auto-updatable-views patch, so it seems past time to extend the API to support that. To do so, replace the "event" parameter (historically of type CmdType, though it was declared plain int) with a special-purpose enum type. It's unclear whether this function might be called by third-party code. Since many C compilers wouldn't warn about a call site continuing to use the old calling convention, rename the function to forcibly break any such code that hasn't been updated. The old name was none too well chosen anyhow.
2012-11-08	Don't trash input list structure in does_not_exist_skipping().	Tom Lane
	The trigger and rule cases need to split up the input name list, but they mustn't corrupt the passed-in data structure, since it could be part of a cached utility-statement parsetree. Per bug #7641.
2012-11-08	Teach pg_basebackup and pg_receivexlog to reply to server keepalives.	Heikki Linnakangas
	Without this, the connection will be killed after timeout if wal_sender_timeout is set in the server. Original patch by Amit Kapila, modified by me to fit recent changes in the code.
2012-11-07	Fix missing inclusions.	Tom Lane
	Some platforms require including <netinet/in.h> and/or <arpa/inet.h> to use htonl() and ntohl(). Per build failure locally.
2012-11-07	Add URLs to document why DLLIMPORT is needed on Windows.	Bruce Momjian
	Per email from Craig Ringer
2012-11-07	Don't try to use a unopened relation	Alvaro Herrera
	Commit 4c9d0901 mistakenly introduced a call to TransferPredicateLocksToHeapRelation() on an index relation that had been closed a few lines above. Moving up an index_open() call that's below is enough to fix the problem. Discovered by me while testing an unrelated patch.
2012-11-07	In pg_upgrade docs, mention using base backup as part of rsync for	Bruce Momjian
	logical replication upgrades. Backpatch to 9.2.
2012-11-07	Make the streaming replication protocol messages architecture-independent.	Heikki Linnakangas
	We used to send structs wrapped in CopyData messages, which works as long as the client and server agree on things like endianess, timestamp format and alignment. That's good enough for running a standby server, which has to run on the same platform anyway, but it's useful for tools like pg_receivexlog to work across platforms. This breaks protocol compatibility of streaming replication, but we never promised that to be compatible across versions, anyway.
2012-11-06	In pg_upgrade, set synchronous_commit=off for the new cluster, to	Bruce Momjian
	improve performance when restoring the schema from the old cluster. Backpatch to 9.2.
2012-11-05	Fix handling of inherited check constraints in ALTER COLUMN TYPE.	Tom Lane
	This case got broken in 8.4 by the addition of an error check that complains if ALTER TABLE ONLY is used on a table that has children. We do use ONLY for this situation, but it's okay because the necessary recursion occurs at a higher level. So we need to have a separate flag to suppress recursion without making the error check. Reported and patched by Pavan Deolasee, with some editorial adjustments by me. Back-patch to 8.4, since this is a regression of functionality that worked in earlier branches.
2012-11-01	Fix typo	Peter Eisentraut

2012-11-01	Fix bogus handling of $(X) (i.e., ".exe") in isolationtester Makefile.	Tom Lane
	I'm not sure why commit 1eb1dde049ccfffc42c80c2bcec14155c58bcc1f seems to have made this start to fail on Cygwin when it never did before --- but nonetheless, the coding was pretty bogus, and unlike the way we handle $(X) anywhere else. Per buildfarm.
2012-11-01	Limit the number of rel sets considered in consider_index_join_outer_rels.	Tom Lane
	In bug #7626, Brian Dunavant exposes a performance problem created by commit 3b8968f25232ad09001bf35ab4cc59f5a501193e: that commit attempted to consider all possible combinations of indexable join clauses, but if said clauses join to enough different relations, there's an exponential increase in the number of outer-relation sets considered. In Brian's example, all the clauses come from the same equivalence class, which means it's redundant to use more than one of them in an indexscan anyway. So we can prevent the problem in this class of cases (which is probably the majority of real examples) by rejecting combinations that would only serve to add a known-redundant clause. But that still leaves us exposed to exponential growth of planning time when the query has a lot of non-equivalence join clauses that are usable with the same index. I chose to prevent such cases by setting an upper limit on the number of relation sets considered, equal to ten times the number of index clauses considered so far. (This sliding limit still allows new relsets to be added on as we move to additional index columns, which is probably more important than considering even more combinations of clauses for the previous column.) This should keep the amount of work done roughly linear rather than exponential in the apparent query complexity. This part of the fix is pretty ad-hoc; but without a clearer idea of real-world cases for which this would result in markedly inferior plans, it's hard to see how to do better.
2012-10-31	Have make never delete intermediate files automatically	Peter Eisentraut
	Several hacks in certain modes already thought this was a bad idea, so just disable it globally.
2012-10-31	Fix erroneous choice of timeline variable, too	Alvaro Herrera

2012-10-31	Document that TCP keepalive settings read as 0 on Unix-socket connections.	Tom Lane
	Per bug #7631 from Rob Johnson. The code is operating as designed, but the docs didn't explain it.
2012-10-31	Fix erroneous choices of segNo variables	Alvaro Herrera
	Commit dfda6eba (which changed segment numbers to use a single 64 bit variable instead of log/seg) introduced a couple of bogus choices of exactly which log segment number variable to use in each case. This is currently pretty harmless; in one place, the bogus number was only being used in an error message for a pretty unlikely condition (failure to fsync a WAL segment file). In the other, it was using a global variable instead of the local variable; but all callsites were passing the value of the global variable anyway. No need to backpatch because that commit is not on earlier branches.
2012-10-31	Fix ALTER EXTENSION / SET SCHEMA	Alvaro Herrera
	In its original conception, it was leaving some objects into the old schema, but without their proper pg_depend entries; this meant that the old schema could be dropped, causing future pg_dump calls to fail on the affected database. This was originally reported by Jeff Frost as #6704; there have been other complaints elsewhere that can probably be traced to this bug. To fix, be more consistent about altering a table's subsidiary objects along the table itself; this requires some restructuring in how tables are relocated when altering an extension -- hence the new AlterTableNamespaceInternal routine which encapsulates it for both the ALTER TABLE and the ALTER EXTENSION cases. There was another bug lurking here, which was unmasked after fixing the previous one: certain objects would be reached twice via the dependency graph, and the second attempt to move them would cause the entire operation to fail. Per discussion, it seems the best fix for this is to do more careful tracking of objects already moved: we now maintain a list of moved objects, to avoid attempting to do it twice for the same object. Authors: Alvaro Herrera, Dimitri Fontaine Reviewed by Tom Lane
2012-10-28	Preserve intermediate .c files in coverage mode	Peter Eisentraut
	The introduction of the .y -> .c pattern rule causes some .c files such as bootparse.c to be considered intermediate files in the .y -> .c -> .o rule chain, which make would automatically delete. But in coverage mode, the processing tools such as genhtml need those files, so mark them as "precious" so that make preserves them.
2012-10-26	Throw error if expiring tuple is again updated or deleted.	Kevin Grittner
	This prevents surprising behavior when a FOR EACH ROW trigger BEFORE UPDATE or BEFORE DELETE directly or indirectly updates or deletes the the old row. Prior to this patch the requested action on the row could be silently ignored while all triggered actions based on the occurence of the requested action could be committed. One example of how this could happen is if the BEFORE DELETE trigger for a "parent" row deleted "children" which had trigger functions to update summary or status data on the parent. This also prevents similar surprising problems if the query has a volatile function which updates a target row while it is already being updated. There are related issues present in FOR UPDATE cursors and READ COMMITTED queries which are not handled by this patch. These issues need further evalution to determine what change, if any, is needed. Where the new error messages are generated, in most cases the best fix will be to move code from the BEFORE trigger to an AFTER trigger. Where this is not feasible, the trigger can avoid the error by re-issuing the triggering statement and returning NULL. Documentation changes will be submitted in a separate patch. Kevin Grittner and Tom Lane with input from Florian Pflug and Robert Haas, based on problems encountered during conversion of Wisconsin Circuit Court trigger logic to plpgsql triggers.
2012-10-26	Prefer actual constants to pseudo-constants in equivalence class machinery.	Tom Lane
	generate_base_implied_equalities_const() should prefer plain Consts over other em_is_const eclass members when choosing the "pivot" value that all the other members will be equated to. This makes it more likely that the generated equalities will be useful in constraint-exclusion proofs. Per report from Rushabh Lathia.
2012-10-26	In pg_dump, dump SEQUENCE SET items in the data not pre-data section.	Tom Lane
	Represent a sequence's current value as a separate TableDataInfo dumpable object, so that it can be dumped within the data section of the archive rather than in pre-data. This fixes an undesirable inconsistency between the meanings of "--data-only" and "--section=data", and also fixes dumping of sequences that are marked as extension configuration tables, as per a report from Marko Kreen back in July. The main cost is that we do one more SQL query per sequence, but that's probably not very meaningful in most databases. Back-patch to 9.1, since it has the extension configuration issue even though not the --section switch.
2012-10-24	Tweak genericcostestimate's fudge factor for index size.	Tom Lane
	To provide some bias against using a large index when a small one would do as well, genericcostestimate adds a "fudge factor", which for a long time was random_page_cost * index_pages/10000. However, this can grow to be the dominant term in indexscan cost estimates when the index involved is large enough, a behavior that was never intended. Change to a ln(1 + n/10000) formulation, which has nearly the same behavior up to a few hundred pages but tails off significantly thereafter. (A log curve seems correct on first principles, since what we're trying to account for here is index descent costs, which are typically logarithmic.) Per bug #7619 from Niko Kiirala. Possibly this change should get back-patched, but I'm hesitant to mess with cost estimates in stable branches.
2012-10-24	When converting a table to a view, remove its system columns.	Tom Lane
	Views should not have any pg_attribute entries for system columns. However, we forgot to remove such entries when converting a table to a view. This could lead to crashes later on, if someone attempted to reference such a column, as reported by Kohei KaiGai. Patch in HEAD only. This bug has been there forever, but in the back branches we will have to defend against existing mis-converted views, so it doesn't seem worthwhile to change the conversion code too.
2012-10-23	Add context info to OAT_POST_CREATE security hook	Alvaro Herrera
	... and have sepgsql use it to determine whether to check permissions during certain operations. Indexes that are being created as a result of REINDEX, for instance, do not need to have their permissions checked; they were already checked when the index was created. Author: KaiGai Kohei, slightly revised by me
2012-10-21	Correct predicate locking for DROP INDEX CONCURRENTLY.	Kevin Grittner
	For the non-concurrent case there is an AccessExclusiveLock lock on both the index and the heap at a time during which no other process is using either, before which the index is maintained and used for scans, and after which the index is no longer used or maintained. Predicate locks can safely be moved from the index to the related heap relation under the protection of these locks. This was done prior to the introductin of DROP INDEX CONCURRENTLY and continues to be done for non-concurrent index drops. For concurrent index drops, the predicate locks must be moved when there are no index scans in progress on that index and no more can subsequently start, and before heap inserts stop maintaining the index. As long as these conditions are guaranteed when the TransferPredicateLocksToHeapRelation() function is called, stronger locks are not needed for correctness. Kevin Grittner based on questions by Tom Lane in reviewing the DROP INDEX CONCURRENTLY patch and in cooperation with Andres Freund and Simon Riggs.
2012-10-20	Fix pg_dump's handling of DROP DATABASE commands in --clean mode.	Tom Lane
	In commit 4317e0246c645f60c39e6572644cff1cb03b4c65, I accidentally broke this behavior while rearranging code to ensure that --create wouldn't affect whether a DATABASE entry gets put into archive-format output. Thus, 9.2 would issue a DROP DATABASE command in --clean mode, which is either useless or dangerous depending on the usage scenario. It should not do that, and no longer does. A bright spot is that this refactoring makes it easy to allow the combination of --clean and --create to work sensibly, ie, emit DROP DATABASE then CREATE DATABASE before reconnecting. Ordinarily we'd consider that a feature addition and not back-patch it, but it seems silly to not include the extra couple of lines required in the 9.2 version of the code. Per report from Guillaume Lelarge, though this is slightly more extensive than his proposed patch.
2012-10-20	Prevent overflow in pgbench's percent-done display.	Tom Lane
	Per Thom Brown.
2012-10-19	Fix UtilityContainsQuery() to handle CREATE TABLE AS EXECUTE correctly.	Tom Lane
	The code seems to have been written to handle the pre-parse-analysis representation, where an ExecuteStmt would appear directly under CreateTableAsStmt. But in reality the function is only run on already-parse-analyzed statements, so there will be a Query node in between. We'd not noticed the bug because the function is generally not used at all except in extended query protocol. Per report from Robert Haas and Rushabh Lathia.
2012-10-19	Fix hash_search to avoid corruption of the hash table on out-of-memory.	Tom Lane
	An out-of-memory error during expand_table() on a palloc-based hash table would leave a partially-initialized entry in the table. This would not be harmful for transient hash tables, since they'd get thrown away anyway at transaction abort. But for long-lived hash tables, such as the relcache hash, this would effectively corrupt the table, leading to crash or other misbehavior later. To fix, rearrange the order of operations so that table enlargement is attempted before we insert a new entry, rather than after adding it to the hash table. Problem discovered by Hitoshi Harada, though this is a bit different from his proposed patch.
2012-10-19	Fix ruleutils to print "INSERT INTO foo DEFAULT VALUES" correctly.	Tom Lane
	Per bug #7615 from Marko Tiikkaja. Apparently nobody ever tried this case before ...
2012-10-19	Fix orphan on cancel of drop index concurrently.	Simon Riggs
	Canceling DROP INDEX CONCURRENTLY during wait could allow an orphaned index to be left behind which could not be dropped. Backpatch to 9.2 Andres Freund, tested by Abhijit Menon-Sen
2012-10-18	Further cleanup of catcache.c ilist changes.	Tom Lane
	Remove useless duplicate initialization of bucket headers, don't use a dlist_mutable_iter in a performance-critical path that doesn't need it, make some other cosmetic changes for consistency's sake.
2012-10-18	Remove unnecessary "head" arguments from some dlist/slist functions.	Tom Lane
	dlist_delete, dlist_insert_after, dlist_insert_before, slist_insert_after do not need access to the list header, and indeed insisting on that negates one of the main advantages of a doubly-linked list. In consequence, revert addition of "cache_bucket" field to CatCTup.
2012-10-18	Code review for inline-list patch.	Tom Lane
	Make foreach macros less syntactically dangerous, and fix some typos in evidently-never-tested ones. Add missing slist_next_node and slist_head_node functions. Fix broken dlist_check code. Assorted comment improvements.
2012-10-18	Use a more portable platform test.	Andrew Dunstan

2012-10-18	Further tweaking of the readfile() function in pg_ctl.	Heikki Linnakangas
	Don't leak a file descriptor if the file is empty or we can't read its size. Expect there to be a newline at the end of the last line, too. If there isn't, ignore anything after the last newline. This makes it a tiny bit more robust in case the file is appended to concurrently, so that we don't return the last line if it hasn't been fully written yet. And this makes the code a bit less obscure, anyway. Per Tom Lane's suggestion. Backpatch to all supported branches.
2012-10-18	Isolation test for DROP INDEX CONCURRENTLY	Simon Riggs
	for recent concurrent changes. Abhijit Menon-Sen
2012-10-18	Re-think guts of DROP INDEX CONCURRENTLY.	Simon Riggs
	Concurrent behaviour was flawed when using a two-step process, so add an additional phase of processing to ensure concurrency for both SELECTs and INSERT/UPDATE/DELETEs. Backpatch to 9.2 Andres Freund, tweaked by me
2012-10-18	Fix planning of non-strict equivalence clauses above outer joins.	Tom Lane
	If a potential equivalence clause references a variable from the nullable side of an outer join, the planner needs to take care that derived clauses are not pushed to below the outer join; else they may use the wrong value for the variable. (The problem arises only with non-strict clauses, since if an upper clause can be proven strict then the outer join will get simplified to a plain join.) The planner attempted to prevent this type of error by checking that potential equivalence clauses aren't outerjoin-delayed as a whole, but actually we have to check each side separately, since the two sides of the clause will get moved around separately if it's treated as an equivalence. Bugs of this type can be demonstrated as far back as 7.4, even though releases before 8.3 had only a very ad-hoc notion of equivalence clauses. In addition, we neglected to account for the possibility that such clauses might have nonempty nullable_relids even when not outerjoin-delayed; so the equivalence-class machinery lacked logic to compute correct nullable_relids values for clauses it constructs. This oversight was harmless before 9.2 because we were only using RestrictInfo.nullable_relids for OR clauses; but as of 9.2 it could result in pushing constructed equivalence clauses to incorrect places. (This accounts for bug #7604 from Bill MacArthur.) Fix the first problem by adding a new test check_equivalence_delay() in distribute_qual_to_rels, and fix the second one by adding code in equivclass.c and called functions to set correct nullable_relids for generated clauses. Although I believe the second part of this is not currently necessary before 9.2, I chose to back-patch it anyway, partly to keep the logic similar across branches and partly because it seems possible we might find other reasons why we need valid values of nullable_relids in the older branches. Add regression tests illustrating these problems. In 9.0 and up, also add test cases checking that we can push constants through outer joins, since we've broken that optimization before and I nearly broke it again with an overly simplistic patch for this problem.
2012-10-18	pg_dump: Output functions deterministically sorted	Alvaro Herrera
	Implementation idea from Tom Lane Author: Joel Jacobson Reviewed by Joachim Wieland
2012-10-18	Revert tests for drop index concurrently.	Simon Riggs

2012-10-18	Add isolation tests for DROP INDEX CONCURRENTLY.	Simon Riggs
	Backpatch to 9.2 to ensure bugs are fixed. Abhijit Menon-Sen
2012-10-17	Close un-owned SMgrRelations at transaction end.	Tom Lane
	If an SMgrRelation is not "owned" by a relcache entry, don't allow it to live past transaction end. This design allows the same SMgrRelation to be used for blind writes of multiple blocks during a transaction, but ensures that we don't hold onto such an SMgrRelation indefinitely. Because an SMgrRelation typically corresponds to open file descriptors at the fd.c level, leaving it open when there's no corresponding relcache entry can mean that we prevent the kernel from reclaiming deleted disk space. (While CacheInvalidateSmgr messages usually fix that, there are cases where they're not issued, such as DROP DATABASE. We might want to add some more sinval messaging for that, but I'd be inclined to keep this type of logic anyway, since allowing VFDs to accumulate indefinitely for blind-written relations doesn't seem like a good idea.) This code replaces a previous attempt towards the same goal that proved to be unreliable. Back-patch to 9.1 where the previous patch was added.
2012-10-17	Revert "Use "transient" files for blind writes, take 2".	Tom Lane
	This reverts commit fba105b1099f4f5fa7283bb17cba6fed2baa8d0c. That approach had problems with the smgr-level state not tracking what we really want to happen, and with the VFD-level state not tracking the smgr-level state very well either. In consequence, it was still possible to hold kernel file descriptors open for long-gone tables (as in recent report from Tore Halset), and yet there were also cases of FDs being closed undesirably soon. A replacement implementation will follow.
2012-10-17	Embedded list interface	Alvaro Herrera
	Provide a common implementation of embedded singly-linked and doubly-linked lists. "Embedded" in the sense that the nodes' next/previous pointers exist within some larger struct; this design choice reduces memory allocation overhead. Most of the implementation uses inlineable functions (where supported), for performance. Some existing uses of both types of lists have been converted to the new code, for demonstration purposes. Other uses can (and probably will) be converted in the future. Since dllist.c is unused after this conversion, it has been removed. Author: Andres Freund Some tweaks by me Reviewed by Tom Lane, Peter Geoghegan