summaryrefslogtreecommitdiff
path: root/src/backend/optimizer
AgeCommit message (Collapse)Author
2005-07-03Don't try to constant-fold functions returning RECORD. We were neverTom Lane
able to do this before, but I had tried to make an exception for functions with OUT parameters. Michael Fuhr found one problem with it already, and I found another, which was it didn't work for strict functions with a NULL input. While both of these could be worked around, the probability that there are more gotchas seems high; I think prudence dictates just reverting to the former behavior for now. Accordingly, remove the kluge added to get_expr_result_type() for Michael's case.
2005-07-03Improve outer-join-deduction logic to be able to propagate equalitiesTom Lane
through multiple join clauses.
2005-07-02Teach planner about some cases where a restriction clause can beTom Lane
propagated inside an outer join. In particular, given LEFT JOIN ON (A = B) WHERE A = constant, we cannot conclude that B = constant at the top level (B might be null instead), but we can nonetheless put a restriction B = constant into the quals for B's relation, since no inner-side rows not meeting that condition can contribute to the final result. Similarly, given FULL JOIN USING (J) WHERE J = constant, we can't directly conclude that either input J variable = constant, but it's OK to push such quals into each input rel. Per recent gripe from Kim Bisgaard. Along the way, remove 'valid_everywhere' flag from RestrictInfo, as on closer analysis it was not being used for anything, and was defined backwards anyway.
2005-06-26Add Oracle-compatible GREATEST and LEAST functions. Pavel StehuleTom Lane
2005-06-14The random selection in function linear() could deliver a value equal to maxTom Lane
if geqo_rand() returns exactly 1.0, resulting in failure due to indexing off the end of the pool array. Also, since this is using inexact float math, it seems wise to guard against roundoff error producing values slightly outside the expected range. Per report from bug@zedware.org.
2005-06-14Teach planner to optionally ignore index columns that have an equalityTom Lane
constraint while determining whether the index sort order matches the query's ORDER BY. This for example allows an index on (x,y) to match ... WHERE x = 42 ORDER BY y; It only works for btree indexes, but since those are the only ones we currently have that are ordered at all, that's good enough for now. Per popular demand.
2005-06-13Change the planner to allow indexscan qualification clauses to useTom Lane
nonconsecutive columns of a multicolumn index, as per discussion around mid-May (pghackers thread "Best way to scan on-disk bitmaps"). This turns out to require only minimal changes in btree, and so far as I can see none at all in GiST. btcostestimate did need some work, but its original assumption that index selectivity == heap selectivity was quite bogus even before this.
2005-06-10Separate predicate-testing code out of indxpath.c, making it a moduleTom Lane
in its own right. As proposed by Simon Riggs, but with some editorializing of my own.
2005-06-10Quick hack to allow the outer query's tuple_fraction to be passed downTom Lane
to a subquery if the outer query is simple enough that the LIMIT can be reflected directly to the subquery. This didn't use to be very interesting, because a subquery that couldn't have been flattened into the upper query was usually not going to be very responsive to tuple_fraction anyway. But with new code that allows UNION ALL subqueries to pay attention to tuple_fraction, this is useful to do. In particular this lets the optimization occur when the UNION ALL is directly inside a view.
2005-06-10If a LIMIT is applied to a UNION ALL query, plan each UNION arm asTom Lane
if the limit were directly applied to it. This does not actually add a LIMIT plan node to the generated subqueries --- that would be useless overhead --- but it does cause the planner to prefer fast- start plans when the limit is small. After an idea from Phil Endecott.
2005-06-10Revise searching of subplan target lists to use something more efficientTom Lane
than tlist_member calls. Building a large join tlist is still O(N^2), but with a much smaller constant factor than before.
2005-06-09Simplify the planner's join clause management by storing join clausesTom Lane
of a relation in a flat 'joininfo' list. The former arrangement grouped the join clauses according to the set of unjoined relids used in each; however, profiling on test cases involving lots of joins proves that that data structure is a net loss. It takes more time to group the join clauses together than is saved by avoiding duplicate tests later. It doesn't help any that there are usually not more than one or two clauses per group ...
2005-06-08Marginal hack to avoid spending a lot of time in find_join_rel duringTom Lane
large planning problems: when the list of join rels gets too long, make an auxiliary hash table that hashes on the identifying Bitmapset.
2005-06-06Nab some low-hanging fruit: replace the planner's base_rel_list andTom Lane
other_rel_list with a single array indexed by rangetable index. This reduces find_base_rel from O(N) to O(1) without any real penalty. While find_base_rel isn't one of the major bottlenecks in any profile I've seen so far, it was starting to creep up on the radar screen for complex queries --- so might as well fix it.
2005-06-05Remove planner's private fields from Query struct, and put them intoTom Lane
a new PlannerInfo struct, which is passed around instead of the bare Query in all the planning code. This commit is essentially just a code-beautification exercise, but it does open the door to making larger changes to the planner data structures without having to muck with the widely-known Query struct.
2005-06-04Change expandRTE() and ResolveNew() back to taking just the singleTom Lane
RTE of interest, rather than the whole rangetable list. This makes the API more understandable and avoids duplicate RTE lookups. This patch reverts no-longer-needed portions of my patch of 2004-08-19.
2005-06-03Revise handling of dropped columns in JOIN alias lists to avoid aTom Lane
performance problem pointed out by phil@vodafone: to wit, we were spending O(N^2) time to check dropped-ness in an N-deep join tree, even in the case where the tree was freshly constructed and couldn't possibly mention any dropped columns. Instead of recursing in get_rte_attribute_is_dropped(), change the data structure definition: the joinaliasvars list of a JOIN RTE must have a NULL Const instead of a Var at any position that references a now-dropped column. This costs nothing during normal parse-rewrite-plan path, and instead we have a linear-time update to make when loading a stored rule that might contain now-dropped columns. While at it, move the responsibility for acquring locks on relations referenced by rules into this separate function (which I therefore chose to call AcquireRewriteLocks). This saves effort --- namely, duplicated lock grabs in parser and rewriter --- in the normal path at a cost of one extra non-locked heap_open() in the stored-rule path; seems a good tradeoff. A fringe benefit is that it is now *much* clearer that we acquire lock on relations referenced in rules before we make any rewriter decisions based on their properties. (I don't know of any bug of that ilk, but it wasn't exactly clear before.)
2005-06-03Just noticed that you can't Query-Cancel a long planner run, becauseTom Lane
no part of the planner did CHECK_FOR_INTERRUPTS(). Add one in a suitably strategic spot.
2005-05-30Add support for FUNCTION RTEs to build_physical_tlist(), so that theTom Lane
physical-tlist optimization can be applied to FunctionScan nodes as well as regular tables and SubqueryScans.
2005-05-30Skip eval_const_expressions when the query is such that the expressionTom Lane
would be evaluated only once anyway (ie, it's just a SELECT with no FROM or an INSERT ... VALUES). The planner can't do it any faster than the executor, so no point in an extra copying of the expression tree.
2005-05-24Previous fix for "x FULL JOIN y ON true" failed to handle the caseTom Lane
where there was also a WHERE-clause restriction that applied to the join. The check on restrictlist == NIL is really unnecessary anyway, because select_mergejoin_clauses already checked for and complained about any unmergejoinable join clauses. So just take it out.
2005-05-23Avoid redundant relation lock grabs during planning, and make sureTom Lane
that we acquire a lock on relations added to the query due to inheritance. Formerly, no such lock was held throughout planning, which meant that a schema change could occur to invalidate the plan before it's even been completed.
2005-05-22Teach the planner to remove SubqueryScan nodes from the plan if theyTom Lane
aren't doing anything useful (ie, neither selection nor projection). Also, extend to SubqueryScan the hacks already in place to avoid unnecessary ExecProject calls when the result would just be the same tuple the subquery already delivered. This saves some overhead in UNION and other set operations, as well as avoiding overhead for unflatten-able subqueries. Per example from Sokolov Yura.
2005-05-06For some reason access/tupmacs.h has been #including utils/memutils.h,Tom Lane
which is neither needed by nor related to that header. Remove the bogus inclusion and instead include the header in those C files that actually need it. Also fix unnecessary inclusions and bad inclusion order in tsearch2 files.
2005-04-28Implement sharable row-level locks, and use them for foreign key referencesTom Lane
to eliminate unnecessary deadlocks. This commit adds SELECT ... FOR SHARE paralleling SELECT ... FOR UPDATE. The implementation uses a new SLRU data structure (managed much like pg_subtrans) to represent multiple- transaction-ID sets. When more than one transaction is holding a shared lock on a particular row, we create a MultiXactId representing that set of transactions and store its ID in the row's XMAX. This scheme allows an effectively unlimited number of row locks, just as we did before, while not costing any extra overhead except when a shared lock actually has to be shared. Still TODO: use the regular lock manager to control the grant order when multiple backends are waiting for a row lock. Alvaro Herrera and Tom Lane.
2005-04-25Avoid rechecking lossy operators twice in a bitmap scan plan.Tom Lane
2005-04-25While determining the filter clauses for an index scan (either plainTom Lane
or bitmap), use pred_test to be a little smarter about cases where a filter clause is logically unnecessary. This may be overkill for the plain indexscan case, but it's definitely useful for OR'd bitmap scans.
2005-04-25Replace slightly klugy create_bitmap_restriction() function with aTom Lane
more efficient routine in restrictinfo.c (which can make use of make_restrictinfo_internal).
2005-04-25Remove support for OR'd indexscans internal to a single IndexScan planTom Lane
node, as this behavior is now better done as a bitmap OR indexscan. This allows considerable simplification in nodeIndexscan.c itself as well as several planner modules concerned with indexscan plan generation. Also we can improve the sharing of code between regular and bitmap indexscans, since they are now working with nigh-identical Plan nodes.
2005-04-23Turns out that my recent elimination of the 'redundant' flatten_andors()Tom Lane
code in prepqual.c had a small drawback: the flatten_andors code was able to cope with deeply nested AND/OR structures (like 10000 ORs in a row), whereas eval_const_expressions tends to recurse until it overruns the stack. Revise eval_const_expressions so that it doesn't choke on deeply nested ANDs or ORs.
2005-04-23Teach choose_bitmap_and() to actually be choosy --- that is, try toTom Lane
make some estimate of which available indexes to AND together, rather than blindly taking 'em all. This could probably stand further improvement, but it seems to do OK in simple tests.
2005-04-23Fix bogus EXPLAIN display of rowcount estimates for BitmapAnd andTom Lane
BitmapOr nodes.
2005-04-22First cut at planner support for bitmap index scans. Lots to do yet,Tom Lane
but the code is basically working. Along the way, rewrite the entire approach to processing OR index conditions, and make it work in join cases for the first time ever. orindxpath.c is now basically obsolete, but I left it in for the time being to allow easy comparison testing against the old implementation.
2005-04-21Rethink original decision to use AND/OR Expr nodes to represent bitmapTom Lane
logic operations during planning. Seems cleaner to create two new Path node types, instead --- this avoids duplication of cost-estimation code. Also, create an enable_bitmapscan GUC parameter to control use of bitmap plans.
2005-04-21Install some slightly realistic cost estimation for bitmap index scans.Tom Lane
2005-04-20Don't try to run clauseless index scans on index types that don't supportTom Lane
it. Per report from Marinos Yannikos.
2005-04-19Create executor and planner-backend support for decoupled heap and indexTom Lane
scans, using in-memory tuple ID bitmaps as the intermediary. The planner frontend (path creation and cost estimation) is not there yet, so none of this code can be executed. I have tested it using some hacked planner code that is far too ugly to see the light of day, however. Committing now so that the bulk of the infrastructure changes go in before the tree drifts under me.
2005-04-14Don't try to constant-fold functions returning RECORD, since the optimizerTom Lane
isn't presently set up to pass them an expected tuple descriptor. Bug has been there since 7.3 but was just recently reported by Thomas Hallgren.
2005-04-14Completion of project to use fixed OIDs for all system catalogs andTom Lane
indexes. Replace all heap_openr and index_openr calls by heap_open and index_open. Remove runtime lookups of catalog OID numbers in various places. Remove relcache's support for looking up system catalogs by name. Bulky but mostly very boring patch ...
2005-04-12Fix oversight in MIN/MAX optimization: must not return NULL entriesTom Lane
from index, since the aggregates ignore NULLs.
2005-04-12Add aggsortop column to pg_aggregate, so that MIN/MAX optimization canTom Lane
be supported for all datatypes. Add CREATE AGGREGATE and pg_dump support too. Add specialized min/max aggregates for bpchar, instead of depending on text's min/max, because otherwise the possible use of bpchar indexes cannot be recognized. initdb forced because of catalog changes.
2005-04-11Create the planner mechanism for optimizing simple MIN and MAX queriesTom Lane
into indexscans on matching indexes. For the moment, it only handles int4 and text datatypes; next step is to add a column to pg_aggregate so that all MIN/MAX aggregates can be handled. Per my recent proposal.
2005-04-10Make constant-folding produce sane output for COALESCE(NULL,NULL),Tom Lane
that is a plain NULL and not a COALESCE with no inputs. Fixes crash reported by Michael Williamson.
2005-04-10Split out into a separate function the code in grouping_planner() thatTom Lane
decides whether to use hashed grouping instead of sort-plus-uniq grouping. The function needs an annoyingly large number of parameters, but this still seems like a win for legibility, since it removes over a hundred lines from grouping_planner (which is still too big :-().
2005-04-06Merge Resdom nodes into TargetEntry nodes to simplify code and save aTom Lane
few palloc's. I also chose to eliminate the restype and restypmod fields entirely, since they are redundant with information stored in the node's contained expression; re-examining the expression at need seems simpler and more reliable than trying to keep restype/restypmod up to date. initdb forced due to change in contents of stored rules.
2005-04-04In cost_mergejoin, the early-exit effect should not apply to theTom Lane
outer side of an outer join. Per andrew@supernews.
2005-03-31First phase of OUT-parameters project. We can now define and use SQLTom Lane
functions with OUT parameters. The various PLs still need work, as does pg_dump. Rudimentary docs and regression tests included.
2005-03-29Convert oidvector and int2vector into variable-length arrays. ThisTom Lane
change saves a great deal of space in pg_proc and its primary index, and it eliminates the former requirement that INDEX_MAX_KEYS and FUNC_MAX_ARGS have the same value. INDEX_MAX_KEYS is still embedded in the on-disk representation (because it affects index tuple header size), but FUNC_MAX_ARGS is not. I believe it would now be possible to increase FUNC_MAX_ARGS at little cost, but haven't experimented yet. There are still a lot of vestigial references to FUNC_MAX_ARGS, which I will clean up in a separate pass. However, getting rid of it altogether would require changing the FunctionCallInfoData struct, and I'm not sure I want to buy into that.
2005-03-28Rethink the order of expression preprocessing: eval_const_expressionsTom Lane
really ought to run before canonicalize_qual, because it can now produce forms that canonicalize_qual knows how to improve (eg, NOT clauses). Also, because eval_const_expressions already knows about flattening nested ANDs and ORs into N-argument form, the initial flatten_andors pass in canonicalize_qual is now completely redundant and can be removed. This doesn't save a whole lot of code, but the time and palloc traffic eliminated is a useful gain on large expression trees.
2005-03-27First steps towards index scans with heap access decoupled from indexTom Lane
access: define new index access method functions 'amgetmulti' that can fetch multiple TIDs per call. (The functions exist but are totally untested as yet.) Since I was modifying pg_am anyway, remove the no-longer-needed 'rel' parameter from amcostestimate functions, and also remove the vestigial amowner column that was creating useless work for Alvaro's shared-object-dependencies project. Initdb forced due to changes in pg_am.