user/sven/postgresql.git

Age	Commit message	Author
2007-08-31	Apply a band-aid fix for the problem that 8.2 and up completely misestimate	Tom Lane
	the number of rows likely to be produced by a query such as SELECT * FROM t1 LEFT JOIN t2 USING (key) WHERE t2.key IS NULL; What this is doing is selecting for t1 rows with no match in t2, and thus it may produce a significant number of rows even if the t2.key table column contains no nulls at all. 8.2 thinks the table column's null fraction is relevant and thus may estimate no rows out, which results in terrible plans if there are more joins above this one. A proper fix for this will involve passing much more information about the context of a clause to the selectivity estimator functions than we ever have. There's no time left to write such a patch for 8.3, and it wouldn't be back-patchable into 8.2 anyway. Instead, put in an ad-hoc test to defeat the normal table-stats-based estimation when an IS NULL test is evaluated at an outer join, and just use a constant estimate instead --- I went with 0.5 for lack of a better idea. This won't catch every case but it will catch the typical ways of writing such queries, and it seems unlikely to make things worse for other queries.
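The query shape in this entry can be reproduced with a small sketch. It uses Python's built-in sqlite3 module rather than PostgreSQL, and the table contents are made up for illustration. It shows only why the null fraction of t2.key says nothing about the result size; the `naive_estimate` line is a deliberately simplified stand-in for the planner's arithmetic, not its actual code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (key INTEGER NOT NULL);
    CREATE TABLE t2 (key INTEGER NOT NULL);
    INSERT INTO t1 VALUES (1), (2), (3), (4);
    INSERT INTO t2 VALUES (1), (2);
""")

# Fraction of NULLs stored in t2.key: zero, because the column is NOT NULL.
null_fraction = conn.execute("SELECT AVG(key IS NULL) FROM t2").fetchone()[0]

# Rows of t1 that have no match in t2 (the anti-join idiom from the commit).
anti_join_rows = conn.execute(
    "SELECT t1.key FROM t1 LEFT JOIN t2 USING (key) "
    "WHERE t2.key IS NULL ORDER BY t1.key"
).fetchall()

t1_rows = conn.execute("SELECT COUNT(*) FROM t1").fetchone()[0]

# Simplified illustration (an assumption, not planner code): scaling by the
# stored null fraction predicts zero rows; a constant 0.5 predicts some rows.
naive_estimate = t1_rows * null_fraction
constant_estimate = t1_rows * 0.5

print(null_fraction, naive_estimate, constant_estimate, anti_join_rows)
```

The stored column contains no nulls, yet the query returns every unmatched t1 row, because the nulls tested by the WHERE clause are produced by the outer join itself.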
2007-08-31	Rewrite make_outerjoininfo's construction of min_lefthand and min_righthand	Tom Lane
	sets for outer joins, in the light of bug #3588 and additional thought and experimentation. The original methodology was fatally flawed for nests of more than two outer joins: it got the relationships between adjacent joins right, but didn't always come to the right conclusions about whether a join could be interchanged with one two or more levels below it. This was largely caused by a mistaken idea that we should use the min_lefthand + min_righthand sets of a sub-join as the minimum left or right input set of an upper join when we conclude that the sub-join can't commute with the upper one. If there's a still-lower join that the sub-join can commute with, this method led us to think that that one could commute with the topmost join; which it can't. Another problem (not directly connected to bug #3588) was that make_outerjoininfo's processing-order-dependent method for enforcing outer join identity #3 didn't work right: if we decided that join A could safely commute with lower join B, we dropped all information about sub-joins under B that join A could perhaps not safely commute with, because we removed B's entire min_righthand from A's. To fix, make an explicit computation of all inner join combinations that occur below an outer join, and add to that the full syntactic relsets of any lower outer joins that we determine it can't commute with. This method gives much more direct enforcement of the outer join rearrangement identities, and it turns out not to cost a lot of additional bookkeeping. Thanks to Richard Harris for the bug report and test case.
2007-07-31	Fix a bug in the original implementation of redundant-join-clause removal:	Tom Lane
	clauses in which one side or the other references both sides of the join cannot be removed as redundant, because that expression won't have been constrained below the join. Per report from Sergey Burladyan.
2007-07-24	Fix predicate-proving logic to cope with binary-compatibility cases when	Tom Lane
	checking whether an IS NULL/IS NOT NULL clause is implied or refuted by a strict function. Per example from Dawid Kuroczko. Backpatch to 8.2 since this is arguably a performance bug.
2007-07-18	Fix an old thinko in SS_make_initplan_from_plan, which is used when optimizing	Tom Lane
	a MIN or MAX aggregate call into an indexscan: the initplan is being made at the current query nesting level and so we shouldn't increment query_level. Though usually harmless, this mistake could lead to bogus "plan should not reference subplan's variable" failures on complex queries. Per bug report from David Sanchez i Gregori.
2007-07-12	Fix mistaken Assert in adjust_appendrel_attr_needed, per Greg Stark.	Tom Lane

2007-05-22	Repair planner bug introduced in 8.2 by ability to rearrange outer joins:	Tom Lane
	in cases where a sub-SELECT inserts a WHERE clause between two outer joins, that clause may prevent us from re-ordering the two outer joins. The code was considering only the joins' own ON-conditions in determining reordering safety, which is not good enough. Add a "delay_upper_joins" flag to OuterJoinInfo to flag that we have detected such a clause and higher-level outer joins shouldn't be permitted to commute with this one. (This might seem overly coarse, but given the current rules for OJ reordering, it's sufficient AFAICT.) The failure case is actually pretty narrow: it needs a WHERE clause within the RHS of a left join that checks the RHS of a lower left join, but is not strict for that RHS (else we'd have simplified the lower join to a plain join). Even then no failure will be manifest unless the planner chooses to rearrange the join order. Per bug report from Adam Terrey.
2007-05-22	Fix best_inner_indexscan to return both the cheapest-total-cost and	Tom Lane
	cheapest-startup-cost innerjoin indexscans, and make joinpath.c consider both of these (when different) as the inside of a nestloop join. The original design was based on the assumption that indexscan paths always have negligible startup cost, and so total cost is the only important figure of merit; an assumption that's obviously broken by bitmap indexscans. This oversight could lead to choosing poor plans in cases where fast-start behavior is more important than total cost, such as LIMIT and IN queries. 8.1-vintage brain fade exposed by an example from Chuck D.
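A minimal sketch of why both figures matter, assuming a linear interpolation between startup and total cost when only a fraction of the rows is fetched. The `Path` class, the `best_inner_paths` helper, and all cost numbers are hypothetical and are not taken from the PostgreSQL source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    startup_cost: float  # cost before the first row can be returned
    total_cost: float    # cost to return every row


def cost_to_fetch(path, fraction):
    """Cost of fetching `fraction` of the rows, assuming linear interpolation."""
    return path.startup_cost + (path.total_cost - path.startup_cost) * fraction


def best_inner_paths(paths):
    """Return both winners instead of only the cheapest-total path."""
    cheapest_total = min(paths, key=lambda p: p.total_cost)
    cheapest_startup = min(paths, key=lambda p: p.startup_cost)
    return cheapest_total, cheapest_startup


paths = [
    Path("bitmap_indexscan", startup_cost=50.0, total_cost=120.0),
    Path("plain_indexscan", startup_cost=0.5, total_cost=200.0),
]
cheapest_total, cheapest_startup = best_inner_paths(paths)
candidates = (cheapest_total, cheapest_startup)

# Fetching everything favours the bitmap scan; fetching 1% (as under LIMIT)
# favours the path with the small startup cost.
best_for_all = min(candidates, key=lambda p: cost_to_fetch(p, 1.0))
best_for_limit = min(candidates, key=lambda p: cost_to_fetch(p, 0.01))
print(best_for_all.name, best_for_limit.name)
```

Keeping only the cheapest-total path would discard the plain indexscan here, even though it wins whenever a small fraction of the rows is needed.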
2007-05-12	Improve predicate_refuted_by_simple_clause() to handle IS NULL and IS NOT NULL	Tom Lane
	more completely. The motivation for having it understand IS NULL at all was to allow use of "foo IS NULL" as one of the subsets of a partitioning on "foo", but as reported by Aleksander Kmetec, it wasn't really getting the job done. Backpatch to 8.2 since this is arguably a performance bug.
2007-05-01	Fix a thinko in my patch of a couple months ago for bug #3116: it did the	Tom Lane
	wrong thing when inlining polymorphic SQL functions, because it was using the function's declared return type where it should have used the actual result type of the current call. In 8.1 and 8.2 this causes obvious failures even if you don't have assertions turned on; in 8.0 and 7.4 it would only be a problem if the inlined expression were used as an input to a function that did run-time type determination on its inputs. Add a regression test, since this is evidently an under-tested area.
2007-04-17	Rewrite choose_bitmap_and() to make it more robust in the presence of	Tom Lane
	competing alternatives for indexes to use in a bitmap scan. The former coding took estimated selectivity as an overriding factor, causing it to sometimes choose indexes that were much slower to scan than ones with a slightly worse selectivity. It was also too narrow-minded about which combinations of indexes to consider ANDing. The rewrite makes it pay more attention to index scan cost than selectivity; this seems sane since it's impossible to have very bad selectivity with low cost, whereas the reverse isn't true. Also, we now consider each index alone, as well as adding each index to an AND-group led by each prior index, for a total of about O(N^2) rather than O(N) combinations considered. This makes the results much less dependent on the exact order in which the indexes are considered. It's still a lot cheaper than an O(2^N) exhaustive search. A prefilter step eliminates all but the cheapest of those indexes using the same set of WHERE conditions, to keep the effective value of N down in scenarios where the DBA has created lots of partially-redundant indexes.
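The search described above can be sketched roughly as follows. This is a simplified illustration, not the actual choose_bitmap_and() code: the cost model (sum of index scan costs plus a heap cost scaled by the product of selectivities), the `HEAP_COST` constant, and all names and numbers are assumptions made for the example.

```python
from dataclasses import dataclass

HEAP_COST = 1000.0  # hypothetical cost of visiting the entire heap


@dataclass(frozen=True)
class IndexPath:
    name: str
    clauses: frozenset   # WHERE conditions this index can use
    scan_cost: float
    selectivity: float


def and_cost(group):
    """Assumed cost model: scan every index, then fetch the surviving rows."""
    selectivity = 1.0
    for path in group:
        selectivity *= path.selectivity  # assumes independent conditions
    return sum(p.scan_cost for p in group) + HEAP_COST * selectivity


def prefilter(paths):
    """Keep only the cheapest index for each distinct set of conditions."""
    cheapest = {}
    for path in paths:
        current = cheapest.get(path.clauses)
        if current is None or path.scan_cost < current.scan_cost:
            cheapest[path.clauses] = path
    return list(cheapest.values())


def choose_and_group(paths):
    """Try each index as group leader, adding later indexes when they help."""
    candidates = sorted(prefilter(paths), key=lambda p: p.scan_cost)
    best_group, best_cost = None, float("inf")
    for i, leader in enumerate(candidates):
        group, cost = [leader], and_cost([leader])
        for other in candidates[i + 1:]:
            trial = group + [other]
            trial_cost = and_cost(trial)
            if trial_cost < cost:
                group, cost = trial, trial_cost
        if cost < best_cost:
            best_group, best_cost = group, cost
    return best_group, best_cost


paths = [
    IndexPath("idx_x", frozenset({"x"}), scan_cost=10.0, selectivity=0.1),
    IndexPath("idx_x_wide", frozenset({"x"}), scan_cost=25.0, selectivity=0.1),
    IndexPath("idx_y", frozenset({"y"}), scan_cost=20.0, selectivity=0.2),
    IndexPath("idx_z", frozenset({"z"}), scan_cost=400.0, selectivity=0.05),
]
group, cost = choose_and_group(paths)
print([p.name for p in group], cost)
```

In this made-up data set idx_z has the best selectivity but is too expensive to scan, so the cost-driven search settles on the two cheap indexes instead. The nested loop is what gives the roughly O(N^2) number of combinations mentioned in the commit message.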
2007-03-06	Fix oversight in original coding of inline_function(): since	Tom Lane
	check_sql_fn_retval allows binary-compatibility cases, the expression extracted from an inline-able SQL function might have a type that is only binary-compatible with the declared function result type. To avoid possibly changing the semantics of the expression, we should insert a RelabelType node in such cases. This has only been shown to have bad consequences in recent 8.1 and up releases, but I suspect there may be failure cases in the older branches too, so patch it all the way back. Per bug #3116 from Greg Mullane. Along the way, fix an omission in eval_const_expressions_mutator: it failed to copy the relabelformat field when processing a RelabelType. No known observable failures from this, but it definitely isn't intended behavior.
2007-02-16	Adjust the definition of is_pushed_down so that it's always true for INNER	Tom Lane
	JOIN quals, just like WHERE quals, even if they reference every one of the join's relations. Now that we can reorder outer and inner joins, it's possible for such a qual to end up being assigned to an outer join plan node, and we mustn't have it treated as a join qual rather than a filter qual for the node. (If it were, the join could produce null-extended rows that it shouldn't.) Per bug report from Pelle Johansson.
2007-02-16	Fix another problem in 8.2 changes that allowed "one-time" qual conditions to	Tom Lane
	be checked at plan levels below the top; namely, we have to allow for Result nodes inserted just above a nestloop inner indexscan. Should think about using the general Param mechanism to pass down outer-relation variables, but for the moment we need a back-patchable solution. Per report from Phil Frost.
2007-02-16	Restructure code that is responsible for ensuring that clauseless joins are	Tom Lane
	considered when it is necessary to do so because of a join-order restriction (that is, an outer-join or IN-subselect construct). The former coding was a bit ad-hoc and inconsistent, and it missed some cases, as exposed by Mario Weilguni's recent bug report. His specific problem was that an IN could be turned into a "clauseless" join due to constant-propagation removing the IN's joinclause, and if the IN's subselect involved more than one relation and there was more than one such IN linking to the same upper relation, then the only valid join orders involve "bushy" plans but we would fail to consider the specific paths needed to get there. (See the example case added to the join regression test.) On examining the code I wonder if there weren't some other problem cases too; in particular it seems that GEQO was defending against a different set of corner cases than the main planner was. There was also an efficiency problem, in that when we did realize we needed a clauseless join because of an IN, we'd consider clauseless joins against every other relation whether this was sensible or not. It seems a better design is to use the outer-join and in-clause lists as a backup heuristic, just as the rule of joining only where there are joinclauses is a heuristic: we'll join two relations if they have a usable joinclause or this might be necessary to satisfy an outer-join or IN-clause join order restriction. I refactored the code to have just one place considering this instead of three, and made sure that it covered all the cases that any of them had been considering. Backpatch as far as 8.1 (which has only the IN-clause form of the disease). By rights 8.0 and 7.4 should have the bug too, but they accidentally fail to fail, because the joininfo structure used in those releases preserves some memory of there having once been a joinclause between the inner and outer sides of an IN, and so it leads the code in the right direction anyway. I'll be conservative and not touch them.
2007-02-13	Repair bug in 8.2's new logic for planning outer joins: we have to allow joins	Tom Lane
	that overlap an outer join's min_righthand but aren't fully contained in it, to support joining within the RHS after having performed an outer join that can commute with this one. Aside from the direct fix in make_join_rel(), fix has_join_restriction() and GEQO's desirable_join() to consider this possibility. Per report from Ian Harding.
2007-02-06	Fix a performance regression in 8.2: optimization of MIN/MAX into indexscans	Tom Lane
	had stopped working for tables buried inside views or sub-selects. This is because I had gotten rid of the simplify_jointree() preprocessing step, and optimize_minmax_aggregates() wasn't smart enough to deal with a non-canonical FromExpr. Per gripe from Bill Howe.
2007-02-02	Repair insufficiently careful type checking for SQL-language functions:	Tom Lane
	we should check that the function code returns the claimed result datatype every time we parse the function for execution. Formerly, for simple scalar result types we assumed the creation-time check was sufficient, but this fails if the function selects from a table that's been redefined since then, and even more obviously fails if check_function_bodies had been OFF. This is a significant security hole: not only can one trivially crash the backend, but with appropriate misuse of pass-by-reference datatypes it is possible to read out arbitrary locations in the server process's memory, which could allow retrieving database content the user should not be able to see. Our thanks to Jeff Trout for the initial report. Security: CVE-2007-0555
2007-01-28	Repair oversight in creation of "append relations": we should set up	Tom Lane
	rel->tuples as well as rel->rows, since some estimation functions expect both to be valid in every baserel. Per report from Dave Dutcher.
2007-01-08	Tweak joinlist creation to avoid generating useless one-element subproblems	Tom Lane
	when collapsing of JOIN trees is stopped by join_collapse_limit. For instance, a list of 11 LEFT JOINs with limit 8 now produces something like ((1 2 3 4 5 6 7 8) 9 10 11 12) instead of (((1 2 3 4 5 6 7 8) (9)) 10 11 12). The latter structure is really only required for a FULL JOIN. Noted while studying an example from Shane Ambler.
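One plausible reading of the new behaviour can be sketched as below. The function names are hypothetical and the logic is an assumption inferred from the example in the message, not a transcription of the planner code: two joinlists are concatenated while they fit under the limit, and when they do not, a one-element side is kept as a bare element rather than wrapped in its own sub-list.

```python
def combine_joinlists(left, right, join_collapse_limit):
    """Merge two joinlists, or nest them when the limit would be exceeded."""
    if len(left) + len(right) <= join_collapse_limit:
        return left + right
    # Limit exceeded: each side becomes a sub-problem, but a one-element
    # side is kept as a bare element instead of a useless one-element list.
    left_part = left[0] if len(left) == 1 else left
    right_part = right[0] if len(right) == 1 else right
    return [left_part, right_part]


def joinlist_for_left_join_chain(n_rels, join_collapse_limit):
    """Build the joinlist for rel 1 LEFT JOIN rel 2 LEFT JOIN ... rel n."""
    joinlist = [1]
    for rel in range(2, n_rels + 1):
        joinlist = combine_joinlists(joinlist, [rel], join_collapse_limit)
    return joinlist


# 11 LEFT JOINs means 12 relations; the limit of 8 matches the example.
result = joinlist_for_left_join_chain(12, 8)
print(result)
```

With these inputs the sketch yields a single eight-element sub-list followed by the bare elements 9 through 12, which is the shape given in the commit message.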
2007-01-08	Remove cost_hashjoin's very ancient hack to discourage (once, entirely forbid)	Tom Lane
	hash joins with the estimated-larger relation on the inside. There are several cases where doing that makes perfect sense, and in cases where it doesn't, the regular cost computation really ought to be able to figure that out. Make some marginal tweaks in said computation to try to get results approximating reality a bit better. Per an example from Shane Ambler. Also, fix an oversight in the original patch to add seq_page_cost: the costs of spilling a hash join to disk should be scaled by seq_page_cost.
2006-12-15	Fix some planner bugs exposed by reports from Arjen van der Meijden. These	Tom Lane
	are all in new-in-8.2 logic associated with indexability of ScalarArrayOpExpr (IN-clauses) or amortization of indexscan costs across repeated indexscans on the inside of a nestloop. In particular: Fix some logic errors in the estimation for multiple scans induced by a ScalarArrayOpExpr indexqual. Include a small cost component in bitmap index scans to reflect the costs of manipulating the bitmap itself; this is mainly to prevent a bitmap scan from appearing to have the same cost as a plain indexscan for fetching a single tuple. Also add a per-index-scan-startup CPU cost component; while prior releases were clearly too pessimistic about the cost of repeated indexscans, the original 8.2 coding allowed the cost of an indexscan to effectively go to zero if repeated often enough, which is overly optimistic. Pay some attention to index correlation when estimating costs for a nestloop inner indexscan: this is significant when the plan fetches multiple heap tuples per iteration, since high correlation means those tuples are probably on the same or adjacent heap pages.
2006-12-12	Fix planner to do the right thing when a degenerate outer join (one whose	Tom Lane
	joinclause doesn't use any outer-side vars) requires a "bushy" plan to be created. The normal heuristic to avoid joins with no joinclause has to be overridden in that case. Problem is new in 8.2; before that we forced the outer join order anyway. Per example from Teodor.
2006-12-07	Repair incorrect placement of WHERE clauses when there are multiple,	Tom Lane
	rearrangeable outer joins and the WHERE clause is non-strict and mentions only nullable-side relations. New bug in 8.2, caused by new logic to allow rearranging outer joins. Per bug #2807 from Ross Cohen; thanks to Jeff Davis for producing a usable test case.
2006-12-06	Fix planning of SubLinks to ensure that Vars generated from transformation of	Tom Lane
	a sublink's test expression have the correct vartypmod, rather than defaulting to -1. There's at least one place where this is important because we're expecting these Vars to be exactly equal() to those appearing in the subplan itself. This is a pretty klugy solution --- it would likely be cleaner to change Param nodes to include a typmod field --- but we can't do that in the already-released 8.2 branch. Per bug report from Hubert Fongarnand.
2006-11-11	Suppress a few 'uninitialized variable' warnings that gcc emits only at	Tom Lane
	-O3 or higher (presumably because it inlines more things). Per gripe from Mark Mielke.
2006-11-10	Fix set_joinrel_size_estimates() to estimate outer-join sizes more	Tom Lane
	accurately: we have to distinguish the effects of the join's own ON clauses from the effects of pushed-down clauses. Failing to do so was a quick hack long ago, but it's time to be smarter. Per example from Thomas H.
2006-10-25	expression_tree_walker failed to let walker function see the immediate child	Tom Lane
	node of a SubLink or SubPlan testexpr field. Bug resulted from replacing the old lefthand/exprs list fields with a simple expression field, and not remembering that expression_tree_walker is coded to save a few cycles by recursing directly to self on list fields (on the assumption the walker isn't interested in List nodes per se). On non-list fields it must of course call the walker. Possibly that hack isn't worth the risk of more such bugs, but I'll leave it be for now. Per bug report from James Robinson.
2006-10-24	Fix check for whether a clauseless join has to be forced in the presence of	Tom Lane
	outer joins. Originally it was only looking for overlap of the righthand side of a left join, but we have to do it on the lefthand side too. Per example from Jean-Pierre Pelletier.
2006-10-04	pgindent run for 8.2.	Bruce Momjian

2006-09-28	Fix IS NULL and IS NOT NULL tests on row-valued expressions to conform to	Tom Lane
	the SQL spec, viz IS NULL is true if all the row's fields are null, IS NOT NULL is true if all the row's fields are not null. The former coding got this right for a limited number of cases with IS NULL (ie, those where it could disassemble a ROW constructor at parse time), but was entirely wrong for IS NOT NULL. Per report from Teodor. I desisted from changing the behavior for arrays, since on closer inspection it's not clear that there's any support for that in the SQL spec. This probably needs more consideration.
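The row semantics stated above can be written out as a short sketch, with Python's None standing in for SQL NULL. The function names are illustrative only.

```python
def row_is_null(row):
    """ROW(...) IS NULL: true only if every field is null."""
    return all(field is None for field in row)


def row_is_not_null(row):
    """ROW(...) IS NOT NULL: true only if every field is non-null."""
    return all(field is not None for field in row)


all_null = (None, None)
no_null = (1, "a")
mixed = (1, None)

for row in (all_null, no_null, mixed):
    print(row, row_is_null(row), row_is_not_null(row))
```

A row with a mix of null and non-null fields satisfies neither test, so for row values IS NOT NULL is not simply the negation of IS NULL.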
2006-09-19	Improve usage of effective_cache_size parameter by assuming that all the	Tom Lane
	tables in the query compete for cache space, not just the one we are currently costing an indexscan for. This seems more realistic, and it definitely will help in examples recently exhibited by Stefan Kaltenbrunner. To get the total size of all the tables involved, we must tweak the handling of 'append relations' a bit --- formerly we looked up information about the child tables on-the-fly during set_append_rel_pathlist, but it needs to be done before we start doing any cost estimation, so push it into the add_base_rels_to_query scan.
2006-09-08	Put back plan-time check for trying to apply SELECT FOR UPDATE/SHARE	Tom Lane
	to a relation on the nullable side of an outer join. I had removed this during the outer join planning rewrite a few months ago ... I think I intended to put it somewhere else, but forgot ...
2006-09-06	Change processing of extended-Query mode so that an unnamed statement	Tom Lane
	that has parameters is always planned afresh for each Bind command, treating the parameter values as constants in the planner. This removes the performance penalty formerly often paid for using out-of-line parameters --- with this definition, the planner can do constant folding, LIKE optimization, etc. After a suggestion by Andrew@supernews.
2006-08-28	Tweak trivial_subqueryscan() to consider a SubqueryScan's targetlist	Tom Lane
	trivial if it contains either Vars referencing the corresponding subplan columns, or Consts equaling the corresponding subplan columns. This lets the planner eliminate the SubqueryScan in some cases generated by generate_setop_tlist().
2006-08-25	Add the ability to create indexes 'concurrently', that is, without	Tom Lane
	blocking concurrent writes to the table. Greg Stark, with a little help from Tom Lane.
2006-08-19	Suppress subquery pullup/pushdown when a subquery contains volatile	Tom Lane
	functions in its targetlist, to avoid introducing multiple evaluations of volatile functions that textually appear only once. This is a slightly tighter version of Jaime Casanova's recent patch.
2006-08-17	Fix an oversight in mergejoin planning: the planner would reject a	Tom Lane
	mergejoin possibility where the inner rel was less well sorted than the outer (ie, it matches some but not all of the merge clauses that can work with the outer), if the inner path in question is also the overall cheapest path for its rel. This is an old bug, but I'm not sure it's worth back-patching, because it's such a corner case. Noted while investigating a test case from Peter Hardman.
2006-08-17	Teach convert_subquery_pathkeys() to handle the case where the	Tom Lane
	subquery's pathkey is a RelabelType applied to something that appears in the subquery's output; for example where the subquery returns a varchar Var and the sort order is shown as that Var coerced to text. This comes up because varchar doesn't have its own sort operator. Per example from Peter Hardman.
2006-08-12	Tweak SPI_cursor_open to allow INSERT/UPDATE/DELETE RETURNING; this was	Tom Lane
	merely a matter of fixing the error check, since the underlying Portal infrastructure already handles it. This in turn allows these statements to be used in some existing plpgsql and plperl contexts, such as a plpgsql FOR loop. Also, do some marginal code cleanup in places that were being sloppy about distinguishing SELECT from SELECT INTO.
2006-08-12	Add INSERT/UPDATE/DELETE RETURNING, with basic docs and regression tests.	Tom Lane
	plpgsql support to come later. Along the way, convert execMain's SELECT INTO support into a DestReceiver, in order to eliminate some ugly special cases. Jonah Harris and Tom Lane
2006-08-10	Fix UNION/INTERSECT/EXCEPT so that when two inputs being merged have	Tom Lane
	same data type and same typmod, we show that typmod as the output typmod, rather than generic -1. This responds to several complaints over the past few years about UNIONs unexpectedly dropping length or precision info.
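The rule described in this entry amounts to a small decision, sketched below. The function name is hypothetical, and the typmod integers are arbitrary placeholders rather than real PostgreSQL encodings.

```python
def setop_output_typmod(left, right):
    """Pick the output typmod for two set-operation inputs.

    Each input is a (type_name, typmod) pair; -1 means "no specific typmod".
    """
    left_type, left_typmod = left
    right_type, right_typmod = right
    if left_type == right_type and left_typmod == right_typmod:
        return left_typmod   # both sides agree, so keep the length/precision
    return -1                # otherwise fall back to the generic typmod


same = setop_output_typmod(("varchar", 14), ("varchar", 14))
different_typmod = setop_output_typmod(("varchar", 14), ("varchar", 24))
different_type = setop_output_typmod(("varchar", 14), ("text", 14))
print(same, different_typmod, different_type)
```

Only the case where both type and typmod match preserves the modifier; any disagreement still produces the generic -1.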
2006-08-05	Fix inheritance_planner() to delete dummy subplans from its Append plan	Tom Lane
	list, when some of the child rels have been excluded by constraint exclusion. This doesn't save a huge amount of time but it'll save some, and it makes the EXPLAIN output look saner. We already did the equivalent thing in set_append_rel_pathlist(), but not here.
2006-08-05	Extend relation_excluded_by_constraints() to check for mutually	Tom Lane
	contradictory WHERE-clauses applied to a relation. This makes the GUC variable constraint_exclusion rather inappropriately named, but I've refrained for the moment from renaming it. Per example from Martin Lesser.
2006-08-05	Teach predicate_refuted_by() how to do proofs involving NOT-clauses.	Tom Lane
	This doesn't matter too much for ordinary NOTs, since prepqual.c does its best to get rid of those, but it helps with IS NOT TRUE clauses which the rule rewriter likes to insert. Per example from Martin Lesser.
2006-08-04	Teach eval_const_expressions to simplify BooleanTest nodes that have	Tom Lane
	constant input. Seems worth doing because rule rewriter inserts IS NOT TRUE tests into WHERE clauses.
2006-08-02	Add support for multi-row VALUES clauses as part of INSERT statements	Joe Conway
	(e.g. "INSERT ... VALUES (...), (...), ...") and elsewhere as allowed by the spec (e.g. similar to a FROM clause subselect). initdb required. Joe Conway and Tom Lane.
2006-07-31	Change the relation_open protocol so that we obtain lock on a relation	Tom Lane
	(table or index) before trying to open its relcache entry. This fixes race conditions in which someone else commits a change to the relation's catalog entries while we are in process of doing relcache load. Problems of that ilk have been reported sporadically for years, but it was not really practical to fix until recently --- for instance, the recent addition of WAL-log support for in-place updates helped. Along the way, remove pg_am.amconcurrent: all AMs are now expected to support concurrent update.
2006-07-27	Aggregate functions now support multiple input arguments. I also took	Tom Lane
	the opportunity to treat COUNT(*) as a zero-argument aggregate instead of the old hack that equated it to COUNT(1); this is materially cleaner (no more weird ANYOID cases) and ought to be at least a tiny bit faster. Original patch by Sergey Koposov; review, documentation, simple regression tests, pg_dump and psql support by moi.
2006-07-26	Code review for bigint-LIMIT patch. Fix missed planner dependency,	Tom Lane
	eliminate unnecessary code, force initdb because stored rules change (limit nodes are now supposed to be int8 not int4 expressions). Update comments and error messages, which still all said 'integer'.