summaryrefslogtreecommitdiff
path: root/src/backend/optimizer
AgeCommit message (Collapse)Author
2009-03-10Fix set_subquery_pathlist() to copy the RTE's subquery before it gets mangledTom Lane
by the planning process. This prevents the "failed to locate grouping columns" error recently reported by Dickson Guedes. That happens because planning replaces SubLinks by SubPlans in the subquery's targetlist, and exprTypmod() is smarter about the former than the latter, causing the apparent type of the subquery's output columns to change. This seems to be a deficiency we should fix in exprTypmod(), but that will be a much more invasive patch with possible side-effects elsewhere, so I'll do that only in HEAD. Back-patch to 8.3. Arguably the lack of a copying step is broken/dangerous all the way back, but in the absence of known problems I'll refrain from making the older branches pay the extra cost. (The reason this particular symptom didn't appear before is that exprTypmod() wasn't smart about SubLinks either, until 8.3.)
2009-03-05Teach the planner to support index access methods that only implementTom Lane
amgettuple or only implement amgetbitmap, instead of the former assumption that every AM supports both APIs. Extracted with minor editorialization from Teodor's fast-GIN-insert patch; whatever becomes of that, this seems like a simple and reasonable generalization of the index AM interface spec.
2009-03-05Fix column privilege checking for cases where parent and child have differentTom Lane
attribute numbering. Also, a parent whole-row reference should not require select privilege on child columns that aren't inherited from the parent. Problem diagnosed by KaiGai Kohei, though this isn't exactly his patch.
2009-02-28Shave a few cycles in compare_pathkeys() by checking for pointer-identicalTom Lane
input lists before we grovel through the lists. This doesn't save much, but testing shows that the case of both inputs NIL is common enough that it saves something. And this is used enough to be a hotspot.
2009-02-27Temporarily (I hope) disable flattening of IN/EXISTS sublinks that are withinTom Lane
the ON clause of an outer join. Doing so is semantically correct but results in de-optimizing queries that were structured to take advantage of the sublink style of execution, as seen in recent complaint from Kevin Grittner. Since the user can get the other behavior by reorganizing his query, having the flattening happen automatically is just a convenience, and that doesn't justify breaking existing applications. Eventually it would be nice to re-enable this, but that seems to require a significantly different approach to outer joins in the executor.
2009-02-27Tighten up join ordering rules to account for recent more-careful analysisTom Lane
of the associativity of antijoins. Also improve optimizer/README discussion of outer join ordering rules.
2009-02-27Improve create_unique_path to not be fooled by unrelated clauses that happenTom Lane
to be syntactically part of a semijoin clause. For example given WHERE EXISTS(SELECT ... WHERE upper.var = lower.var AND some-condition) where some-condition is just a restriction on the lower relation, we can use unique-ification on lower.var after having applied some-condition within the scan on lower.
2009-02-25Get rid of the rather fuzzily defined FlattenedSubLink node type in favor ofTom Lane
making pull_up_sublinks() construct a full-blown JoinExpr tree representation of IN/EXISTS SubLinks that it is able to convert to semi or anti joins. This makes pull_up_sublinks() a shade more complex, but the gain in semantic clarity is worth it. I still have more to do in this area to address the previously-discussed problems, but this commit in itself fixes at least one bug in HEAD, as shown by added regression test case.
2009-02-20Simplify overcomplicated (and overly restrictive) test to see whether anTom Lane
IS NULL condition is rendered redundant by detection of an antijoin. If we know that a join is an antijoin, then *any* Var coming out of its righthand side must be NULL, not only the joining column(s). Also, it's still gonna be null after being passed up through higher joins, whether they're outer joins or not. I was misled by a faulty analogy to reduce_outer_joins() in the original coding. But consider select * from a left join b on a.x = b.y where b.y is null and b.z is null; The first IS NULL condition justifies deciding that the join is an antijoin (if the = is strict) and then the second one is just plain redundant.
2009-02-19Improve comments about semijoin implementation strategy, per a questionTom Lane
from Robert Haas.
2009-02-15Teach the planner to treat a partial unique index as proving a variable isTom Lane
unique for a particular query, if the index predicate is satisfied. This requires a bit of reordering of operations so that we check the predicates before doing any selectivity estimates, but shouldn't really cause any noticeable slowdown. Per a comment from Michal Politowski.
2009-02-06Fix cost_mergejoin's failure to adjust for rescanning of non-unique merge joinTom Lane
keys when considering a semi or anti join. This requires estimating the selectivity of the merge qual as though it were a regular inner join condition. To allow caching both that and the real outer-join-aware selectivity, split RestrictInfo.this_selec into two fields. This fixes one of the problems reported by Kevin Grittner.
2009-02-05Fix an old corner-case error in match_unsorted_outer(): don't considerTom Lane
the cheapest-total inner path as a new candidate while truncating the sort key list, if it already matched the full sort key list. This is too much of a corner case to be worth back-patching, since it's unusual for the cheapest total path to be sorted, and anyway no real harm is done (except in JOIN_SEMI/ANTI cases where cost_mergejoin is a bit broken at the moment). But it wasn't behaving as intended, so fix it. Noted while examining a test case from Kevin Grittner. This error doesn't explain his issue, but it does explain why "set enable_seqscan = off" seemed to reproduce it for me.
2009-01-22Support column-level privileges, as required by SQL standard.Tom Lane
Stephen Frost, with help from KaiGai Kohei and others
2009-01-09Arrange for function default arguments to be processed properly in expressionsTom Lane
that are set up for execution with ExecPrepareExpr rather than going through the full planner process. By introducing an explicit notion of "expression planning", this patch also lays a bit of groundwork for maybe someday allowing sub-selects in standalone expressions.
2009-01-07Create a third option named "partition" for constraint_exclusion, and make itTom Lane
the default. This setting enables constraint exclusion checks only for appendrel members (ie, inheritance children and UNION ALL arms), which are the cases in which constraint exclusion is most likely to be useful. Avoiding the overhead for simple queries that are unlikely to benefit should bring the cost down to the point where this is a reasonable default setting. Per today's discussion.
2009-01-06Fix an oversight in the function-default-arguments patch: after adding someTom Lane
default expressions to a function call, eval_const_expressions must recurse on those expressions. Else they don't get simplified, and in particular we fail to insert additional default arguments if any functions needing defaults are in there. Per report from Rushabh Lathia.
2009-01-01Update copyright for 2009.Bruce Momjian
2008-12-31Add some basic support for window frame clauses to the window-functionsTom Lane
patch. This includes the ability to force the frame to cover the whole partition, and the ability to make the frame end exactly on the current row rather than its last ORDER BY peer. Supporting any more of the full SQL frame-clause syntax will require nontrivial hacking on the window aggregate code, so it'll have to wait for 8.5 or beyond.
2008-12-28Support window functions a la SQL:2008.Tom Lane
Hitoshi Harada, with some kibitzing from Heikki and Tom.
2008-12-18Code review for function default parameters patch. Fix numerous problems asTom Lane
per recent discussions. In passing this also fixes a couple of bugs in the previous variadic-parameters patch.
2008-12-08Don't try to optimize EXISTS subqueries with empty FROM-lists: we need toTom Lane
form a join and that case doesn't have anything to join to. (We could probably make it work if we didn't pull up the subquery, but it seems to me that the case isn't worth extra code.) Per report from Greg Stark.
2008-12-01Fix an oversight in the code that makes transitive-equality deductions fromTom Lane
outer join clauses. Given, say, ... from a left join b on a.a1 = b.b1 where a.a1 = 42; we'll deduce a clause b.b1 = 42 and then mark the original join clause redundant (we can't remove it completely for reasons I don't feel like squeezing into this log entry). However the original implementation of that wasn't bulletproof, because clause_selectivity() wouldn't honor this_selec if given nonzero varRelid --- which in practice meant that it worked as desired *except* when considering index scan quals. Which resulted in bogus underestimation of the size of the indexscan result for an inner indexscan in an outer join, and consequently a possibly bad choice of indexscan vs. bitmap scan. Fix by introducing an explicit test into clause_selectivity(). Also, to make sure we don't trigger that test in corner cases, change the convention to be that this_selec > 1, not this_selec = 1, means it's been marked redundant. Per trouble report from Scara Maccai. Back-patch to 8.2, where the problem was introduced.
2008-11-28My recent fix for semijoin planning didn't actually work for a semijoin with aTom Lane
RHS that can't be unique-ified --- join_is_legal has to check that before deciding to build a join, else we'll have an unimplementable joinrel. Per report from Greg Stark.
2008-11-22Switch the planner over to treating qualifications of a JOIN_SEMI join asTom Lane
though it is an inner rather than outer join type. This essentially means that we don't bother to separate "pushed down" qual conditions from actual join quals at a semijoin plan node; which is okay because the restrictions of SQL syntax make it impossible to have a pushed-down qual that references the inner side of a semijoin. This allows noticeably better optimization of IN/EXISTS cases than we had before, since the equivalence-class machinery can now use those quals. Also fix a couple of other mistakes that had essentially disabled the ability to unique-ify the inner relation and then join it to just a subset of the left-hand relations. An example case using the regression database is select * from tenk1 a, tenk1 b where (a.unique1,b.unique2) in (select unique1,unique2 from tenk1 c); which is planned reasonably well by 8.3 and earlier but had been forcing a cartesian join of a/b in CVS HEAD.
2008-11-20Fix breakage of bitmap scan plan creation for special index operators suchTom Lane
as LIKE. I oversimplified this code when removing support for plan-time determination of index operator lossiness back in April --- I had thought create_bitmap_subplan could stop returning two separate lists of qual conditions, but it still must so that we can treat special operators correctly in create_bitmap_scan_plan. Per report from Rushabh Lathia.
2008-11-15Make SELECT FOR UPDATE/SHARE work on inheritance trees, by having the planTom Lane
return the tableoid as well as the ctid for any FOR UPDATE targets that have child tables. All child tables are listed in the ExecRowMark list, but the executor just skips the ones that didn't produce the current row. Curiously, this longstanding restriction doesn't seem to have been documented anywhere; so no doc changes.
2008-11-13Arrange to cache the results of looking up a btree predicate proof comparisonTom Lane
operator. The result depends only on the two input operators and the proof direction (imply or refute), so it's easy to cache. This provides a very large savings in cases such as Sergey Konoplev's long NOT-IN-list example, where predtest spends all its time repeatedly figuring out that the same pair of operators cannot be used to prove anything. (But of course the O(N^2) behavior still catches up with you eventually.) I'm not convinced it buys a whole lot when constraint_exclusion isn't turned on, but it's not a lot of added code so we might as well cache all the time.
2008-11-12In predtest.c, install a limit on the number of branches we will process inTom Lane
AND, OR, or equivalent clauses: if there are too many (more than 100) just exit without proving anything. This ensures that we don't spend O(N^2) time trying (and most likely failing) to prove anything about very long IN lists and similar cases. Also, install a couple of CHECK_FOR_INTERRUPTS calls to ensure that a long proof attempt can be interrupted. Per gripe from Sergey Konoplev. Back-patch the whole patch to 8.2 and just the CHECK_FOR_INTERRUPTS addition to 8.1. (The rest of the patch doesn't apply cleanly, and since 8.1 doesn't show the complained-of behavior anyway, it doesn't seem necessary to work hard on it.)
2008-11-11Ensure that the phrels sets of PlaceHolderVars appearing in an AppendRelInfo'sTom Lane
translated_vars list get updated when pulling up an appendrel member. It's not clear that this really matters at present, since relatively little gets done with the outputs of an appendrel child relation; but it probably will come back to bite us sometime if we leave them with the wrong values.
2008-11-11Get rid of adjust_appendrel_attr_needed(), which has been broken ever sinceTom Lane
we extended the appendrel mechanism to support UNION ALL optimization. The reason nobody noticed was that we are not actually using attr_needed data for appendrel children; hence it seems more reasonable to rip it out than fix it. Back-patch to 8.2 because an Assert failure is possible in corner cases. Per examination of an example from Jim Nasby. In HEAD, also get rid of AppendRelInfo.col_mappings, which is quite inadequate to represent UNION ALL situations; depend entirely on translated_vars instead.
2008-11-02Remove all uses of the deprecated functions heap_formtuple, heap_modifytuple,Tom Lane
and heap_deformtuple in favor of the newer functions heap_form_tuple et al (which do the same things but use bool control flags instead of arbitrary char values). Eliminate the former duplicate coding of these functions, reducing the deprecated functions to mere wrappers around the newer ones. We can't get rid of them entirely because add-on modules probably still contain many instances of the old coding style. Kris Jurka
2008-10-25Be a little smarter about qual handling for semi-joins: a qual that mentionsTom Lane
only the outer side can be pushed down rather than having to be evaluated at the join.
2008-10-22Dept of better ideas: refrain from creating the planner's placeholder_listTom Lane
until vars are distributed to rels during query_planner() startup. We don't really need it before that, and not building it early has some advantages. First, we don't need to put it through the various preprocessing steps, which saves some cycles and eliminates the need for a number of routines to support PlaceHolderInfo nodes at all. Second, this means one less unused plan for any sub-SELECT appearing in a placeholder's expression, since we don't build placeholder_list until after sublink expansion is complete.
2008-10-21Add a concept of "placeholder" variables to the planner. These are variablesTom Lane
that represent some expression that we desire to compute below the top level of the plan, and then let that value "bubble up" as though it were a plain Var (ie, a column value). The immediate application is to allow sub-selects to be flattened even when they are below an outer join and have non-nullable output expressions. Formerly we couldn't flatten because such an expression wouldn't properly go to NULL when evaluated above the outer join. Now, we wrap it in a PlaceHolderVar and arrange for the actual evaluation to occur below the outer join. When the resulting Var bubbles up through the join, it will be set to NULL if necessary, yielding the correct results. This fixes a planner limitation that's existed since 7.1. In future we might want to use this mechanism to re-introduce some form of Hellerstein's "expensive functions" optimization, ie place the evaluation of an expensive function at the most suitable point in the plan tree.
2008-10-17Salvage a little bit of work from a failed patch: simplify and speed upTom Lane
set_rel_width(). The code had been catering for the possibility of different varnos in the relation targetlist, but this is impossible for a base relation (and if it were possible, putting all the widths in the same RelOptInfo would be wrong anyway).
2008-10-09Improve the recently-added code for inlining set-returning functions so thatTom Lane
it can handle functions returning setof record. The case was left undone originally, but it turns out to be simple to fix.
2008-10-07Extend CTE patch to support recursive UNION (ie, without ALL). TheTom Lane
implementation uses an in-memory hash table, so it will poop out for very large recursive results ... but the performance characteristics of a sort-based implementation would be pretty unpleasant too.
2008-10-06When expanding a whole-row Var into a RowExpr during ResolveNew(), attachTom Lane
the column alias names of the RTE referenced by the Var to the RowExpr. This is needed to allow ruleutils.c to correctly deparse FieldSelect nodes referencing such a construct. Per my recent bug report. Adding a field to RowExpr forces initdb (because of stored rules changes) so this solution is not back-patchable; which is unfortunate because 8.2 and 8.3 have this issue. But it only affects EXPLAIN for some pretty odd corner cases, so we can probably live without a solution for the back branches.
2008-10-04Implement SQL-standard WITH clauses, including WITH RECURSIVE.Tom Lane
There are some unimplemented aspects: recursive queries must use UNION ALL (should allow UNION too), and we don't have SEARCH or CYCLE clauses. These might or might not get done for 8.4, but even without them it's a pretty useful feature. There are also a couple of small loose ends and definitional quibbles, which I'll send a memo about to pgsql-hackers shortly. But let's land the patch now so we can get on with other development. Yoshiyuki Asaba, with lots of help from Tatsuo Ishii and Tom Lane
2008-09-12Skip opfamily check in eclass_matches_any_index() when the index isn't aTom Lane
btree. We can't easily tell whether clauses generated from the equivalence class could be used with such an index, so just assume that they might be. This bit of over-optimization prevented use of non-btree indexes for nestloop inner indexscans, in any case where the join uses an equality operator that is also a btree operator --- which in particular is typically true for hash indexes. Noted while trying to test the current hash index patch.
2008-09-09Improve the plan cache invalidation mechanism to make it invalidate plansTom Lane
when user-defined functions used in a plan are modified. Also invalidate plans when schemas, operators, or operator classes are modified; but for these cases we just invalidate everything rather than tracking exact dependencies, since these types of objects seldom change in a production database. Tom Lane; loosely based on a patch by Martin Pihlak.
2008-09-05Fix an oversight in the 8.2 patch that improved mergejoin performance byTom Lane
inserting a materialize node above an inner-side sort node, when the sort is expected to spill to disk. (The materialize protects the sort from having to support mark/restore, allowing it to do its final merge pass on-the-fly.) We neglected to teach cost_mergejoin about that hack, so it was failing to include the materialize's costs in the estimated cost of the mergejoin. The materialize's costs are generally going to be pretty negligible in comparison to the sort's, so this is only a small error and probably not worth back-patching; but it's still wrong. In the similar case where a materialize is inserted to protect an inner-side node that can't do mark/restore at all, it's still true that the materialize should not spill to disk, and so we should cost it cheaply rather than expensively. Noted while thinking about a question from Tom Raney.
2008-09-01Add a bunch of new error location reports to parse-analysis error messages.Tom Lane
There are still some weak spots around JOIN USING and relation alias lists, but most errors reported within backend/parser/ now have locations.
2008-08-28Extend the parser location infrastructure to include a location field inTom Lane
most node types used in expression trees (both before and after parse analysis). This allows us to place an error cursor in many situations where we formerly could not, because the information wasn't available beyond the very first level of parse analysis. There's a fair amount of work still to be done to persuade individual ereport() calls to actually include an error location, but this gets the initdb-forcing part of the work out of the way; and the situation is already markedly better than before for complaints about unimplementable implicit casts, such as CASE and UNION constructs with incompatible alternative data types. Per my proposal of a few days ago.
2008-08-26Teach eval_const_expressions() to simplify an ArrayCoerceExpr to a constantTom Lane
when its input is constant and the element coercion function is immutable (or nonexistent, ie, binary-coercible case). This is an oversight in the 8.3 implementation of ArrayCoerceExpr, and its result is that certain cases involving IN or NOT IN with constants don't get optimized as they should be. Per experimentation with an example from Ow Mun Heng.
2008-08-25Move exprType(), exprTypmod(), expression_tree_walker(), and related routinesTom Lane
into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside the backend. There's probably more that should be done along this line, but this is a start anyway.
2008-08-22Arrange to convert EXISTS subqueries that are equivalent to hashable INTom Lane
subqueries into the same thing you'd have gotten from IN (except always with unknownEqFalse = true, so as to get the proper semantics for an EXISTS). I believe this fixes the last case within CVS HEAD in which an EXISTS could give worse performance than an equivalent IN subquery. The tricky part of this is that if the upper query probes the EXISTS for only a few rows, the hashing implementation can actually be worse than the default, and therefore we need to make a cost-based decision about which way to use. But at the time when the planner generates plans for subqueries, it doesn't really know how many times the subquery will be executed. The least invasive solution seems to be to generate both plans and postpone the choice until execution. Therefore, in a query that has been optimized this way, EXPLAIN will show two subplans for the EXISTS, of which only one will actually get executed. There is a lot more that could be done based on this infrastructure: in particular it's interesting to consider switching to the hash plan if we start out using the non-hashed plan but find a lot more upper rows going by than we expected. I have therefore left some minor inefficiencies in place, such as initializing both subplans even though we will currently only use one.
2008-08-20Marginal improvement in sublink planning: allow unknownEqFalse optimizationTom Lane
to be used for SubLinks that are underneath a top-level OR clause. Just as at the very top level of WHERE, it's not necessary to be accurate about whether the sublink returns FALSE or NULL, because either result has the same impact on whether the WHERE will succeed.
2008-08-20Fix obsolete comment. It's no longer the case that Param nodes don'tTom Lane
carry typmod.