summaryrefslogtreecommitdiff
path: root/src/backend/optimizer
AgeCommit message (Collapse)Author
2006-09-19Improve usage of effective_cache_size parameter by assuming that all theTom Lane
tables in the query compete for cache space, not just the one we are currently costing an indexscan for. This seems more realistic, and it definitely will help in examples recently exhibited by Stefan Kaltenbrunner. To get the total size of all the tables involved, we must tweak the handling of 'append relations' a bit --- formerly we looked up information about the child tables on-the-fly during set_append_rel_pathlist, but it needs to be done before we start doing any cost estimation, so push it into the add_base_rels_to_query scan.
2006-09-08Put back plan-time check for trying to apply SELECT FOR UPDATE/SHARETom Lane
to a relation on the nullable side of an outer join. I had removed this during the outer join planning rewrite a few months ago ... I think I intended to put it somewhere else, but forgot ...
2006-09-06Change processing of extended-Query mode so that an unnamed statementTom Lane
that has parameters is always planned afresh for each Bind command, treating the parameter values as constants in the planner. This removes the performance penalty formerly often paid for using out-of-line parameters --- with this definition, the planner can do constant folding, LIKE optimization, etc. After a suggestion by Andrew@supernews.
2006-08-28Tweak trivial_subqueryscan() to consider a SubqueryScan's targetlistTom Lane
trivial if it contains either Vars referencing the corresponding subplan columns, or Consts equaling the corresponding subplan columns. This lets the planner eliminate the SubqueryScan in some cases generated by generate_setop_tlist().
2006-08-25Add the ability to create indexes 'concurrently', that is, withoutTom Lane
blocking concurrent writes to the table. Greg Stark, with a little help from Tom Lane.
2006-08-19Suppress subquery pullup/pushdown when a subquery contains volatileTom Lane
functions in its targetlist, to avoid introducing multiple evaluations of volatile functions that textually appear only once. This is a slightly tighter version of Jaime Casanova's recent patch.
2006-08-17Fix an oversight in mergejoin planning: the planner would reject aTom Lane
mergejoin possibility where the inner rel was less well sorted than the outer (ie, it matches some but not all of the merge clauses that can work with the outer), if the inner path in question is also the overall cheapest path for its rel. This is an old bug, but I'm not sure it's worth back-patching, because it's such a corner case. Noted while investigating a test case from Peter Hardman.
2006-08-17Teach convert_subquery_pathkeys() to handle the case where theTom Lane
subquery's pathkey is a RelabelType applied to something that appears in the subquery's output; for example where the subquery returns a varchar Var and the sort order is shown as that Var coerced to text. This comes up because varchar doesn't have its own sort operator. Per example from Peter Hardman.
2006-08-12Tweak SPI_cursor_open to allow INSERT/UPDATE/DELETE RETURNING; this wasTom Lane
merely a matter of fixing the error check, since the underlying Portal infrastructure already handles it. This in turn allows these statements to be used in some existing plpgsql and plperl contexts, such as a plpgsql FOR loop. Also, do some marginal code cleanup in places that were being sloppy about distinguishing SELECT from SELECT INTO.
2006-08-12Add INSERT/UPDATE/DELETE RETURNING, with basic docs and regression tests.Tom Lane
plpgsql support to come later. Along the way, convert execMain's SELECT INTO support into a DestReceiver, in order to eliminate some ugly special cases. Jonah Harris and Tom Lane
2006-08-10Fix UNION/INTERSECT/EXCEPT so that when two inputs being merged haveTom Lane
same data type and same typmod, we show that typmod as the output typmod, rather than generic -1. This responds to several complaints over the past few years about UNIONs unexpectedly dropping length or precision info.
2006-08-05Fix inheritance_planner() to delete dummy subplans from its Append planTom Lane
list, when some of the child rels have been excluded by constraint exclusion. This doesn't save a huge amount of time but it'll save some, and it makes the EXPLAIN output look saner. We already did the equivalent thing in set_append_rel_pathlist(), but not here.
2006-08-05Extend relation_excluded_by_constraints() to check for mutuallyTom Lane
contradictory WHERE-clauses applied to a relation. This makes the GUC variable constraint_exclusion rather inappropriately named, but I've refrained for the moment from renaming it. Per example from Martin Lesser.
2006-08-05Teach predicate_refuted_by() how to do proofs involving NOT-clauses.Tom Lane
This doesn't matter too much for ordinary NOTs, since prepqual.c does its best to get rid of those, but it helps with IS NOT TRUE clauses which the rule rewriter likes to insert. Per example from Martin Lesser.
2006-08-04Teach eval_const_expressions to simplify BooleanTest nodes that haveTom Lane
constant input. Seems worth doing because rule rewriter inserts IS NOT TRUE tests into WHERE clauses.
2006-08-02Add support for multi-row VALUES clauses as part of INSERT statementsJoe Conway
(e.g. "INSERT ... VALUES (...), (...), ...") and elsewhere as allowed by the spec. (e.g. similar to a FROM clause subselect). initdb required. Joe Conway and Tom Lane.
2006-07-31Change the relation_open protocol so that we obtain lock on a relationTom Lane
(table or index) before trying to open its relcache entry. This fixes race conditions in which someone else commits a change to the relation's catalog entries while we are in process of doing relcache load. Problems of that ilk have been reported sporadically for years, but it was not really practical to fix until recently --- for instance, the recent addition of WAL-log support for in-place updates helped. Along the way, remove pg_am.amconcurrent: all AMs are now expected to support concurrent update.
2006-07-27Aggregate functions now support multiple input arguments. I also tookTom Lane
the opportunity to treat COUNT(*) as a zero-argument aggregate instead of the old hack that equated it to COUNT(1); this is materially cleaner (no more weird ANYOID cases) and ought to be at least a tiny bit faster. Original patch by Sergey Koposov; review, documentation, simple regression tests, pg_dump and psql support by moi.
2006-07-26Code review for bigint-LIMIT patch. Fix missed planner dependency,Tom Lane
eliminate unnecessary code, force initdb because stored rules change (limit nodes are now supposed to be int8 not int4 expressions). Update comments and error messages, which still all said 'integer'.
2006-07-26Convert effective_cache_size to an integer, for better integration withPeter Eisentraut
upcoming units feature.
2006-07-26Change LIMIT/OFFSET to use int8Bruce Momjian
Dhanaraj M
2006-07-22In the recent changes to make the planner account better for cacheTom Lane
effects in a nestloop inner indexscan, I had only dealt with plain index scans and the index portion of bitmap scans. But there will be cache benefits for the heap accesses of bitmap scans too, so fix cost_bitmap_heap_scan() to account for that.
2006-07-15Fix some missing inclusions identified with new pgcheckdefines tool.Tom Lane
2006-07-14Remove 576 references of include files that were not needed.Bruce Momjian
2006-07-13More include file adjustments.Bruce Momjian
2006-07-11Alphabetically order reference to include files, "S"-"Z".Bruce Momjian
2006-07-11Alphabetically order reference to include files, "N" - "S".Bruce Momjian
2006-07-11Alphabetically order reference to include files, "G" - "M".Bruce Momjian
2006-07-11Sort reference of include files, "A" - "F".Bruce Momjian
2006-07-01Fix oversight in planning for multiple indexscans driven byTom Lane
ScalarArrayOpExpr index quals: we were estimating the right total number of rows returned, but treating the index-access part of the cost as if a single scan were fetching that many consecutive index tuples. Actually we should treat it as a multiple indexscan, and if there are enough of 'em the Mackert-Lohman discount should kick in.
2006-07-01Revise the planner's handling of "pseudoconstant" WHERE clauses, that isTom Lane
clauses containing no variables and no volatile functions. Such a clause can be used as a one-time qual in a gating Result plan node, to suppress plan execution entirely when it is false. Even when the clause is true, putting it in a gating node wins by avoiding repeated evaluation of the clause. In previous PG releases, query_planner() would do this for pseudoconstant clauses appearing at the top level of the jointree, but there was no ability to generate a gating Result deeper in the plan tree. To fix it, get rid of the special case in query_planner(), and instead process pseudoconstant clauses through the normal RestrictInfo qual distribution mechanism. When a pseudoconstant clause is found attached to a path node in create_plan(), pull it out and generate a gating Result at that point. This requires special-casing pseudoconstants in selectivity estimation and cost_qual_eval, but on the whole it's pretty clean. It probably even makes the planner a bit faster than before for the normal case of no pseudoconstants, since removing pull_constant_clauses saves one useless traversal of the qual tree. Per gripe from Phil Frost.
2006-06-28Improve planner estimates for size of tuple hash tables.Tom Lane
2006-06-16Fix problems with cached tuple descriptors disappearing while still in useTom Lane
by creating a reference-count mechanism, similar to what we did a long time ago for catcache entries. The back branches have an ugly solution involving lots of extra copies, but this way is more efficient. Reference counting is only applied to tupdescs that are actually in caches --- there seems no need to use it for tupdescs that are generated in the executor, since they'll go away during plan shutdown by virtue of being in the per-query memory context. Neil Conway and Tom Lane
2006-06-07Remove "fuzzy comparison" logic in qsort comparison function forTom Lane
choose_bitmap_and(). It was way too fuzzy --- per comment, it was meant to be 1% relative difference, but was actually coded as 0.01 absolute difference, thus causing selectivities of say 0.001 and 0.000000000001 to be treated as equal. I believe this thinko explains Maxim Boguk's recent complaint. While we could change it to a relative test coded like compare_fuzzy_path_costs(), there's a bigger problem here, which is that any fuzziness at all renders the comparison function non-transitive, which could confuse qsort() to the point of delivering completely wrong results. So forget the whole thing and just do an exact comparison.
2006-06-06Make the planner estimate costs for nestloop inner indexscans on the basisTom Lane
that the Mackert-Lohmann formula applies across all the repetitions of the nestloop, not just each scan independently. We use the M-L formula to estimate the number of pages fetched from the index as well as from the table; that isn't what it was designed for, but it seems reasonably applicable anyway. This makes large numbers of repetitions look much cheaper than before, which accords with many reports we've received of overestimation of the cost of a nestloop. Also, change the index access cost model to charge random_page_cost per index leaf page touched, while explicitly not counting anything for access to metapage or upper tree pages. This may all need tweaking after we get some field experience, but in simple tests it seems to be giving saner results than before. The main thing is to get the infrastructure in place to let cost_index() and amcostestimate functions take repeated scans into account at all. Per my recent proposal. Note: this patch changes pg_proc.h, but I did not force initdb because the changes are basically cosmetic --- the system does not look into pg_proc to decide how to call an index amcostestimate function, and there's no way to call such a function from SQL at all.
2006-06-05While making the seq_page_cost changes, I was struck by the fact thatTom Lane
cost_nonsequential_access() is really totally inappropriate for its only remaining use, namely estimating I/O costs in cost_sort(). The routine was designed on the assumption that disk caching might eliminate the need for some re-reads on a random basis, but there's nothing very random in that sense about sort's access pattern --- it'll always be picking up the oldest outputs. If we had a good fix on the effective cache size we might consider charging zero for I/O unless the sort temp file size exceeds it, but that's probably putting much too much faith in the parameter. Instead just drop the logic in favor of a fixed compromise between seq_page_cost and random_page_cost per page of sort I/O.
2006-06-05Add a GUC parameter seq_page_cost, and use that everywhere we formerlyTom Lane
assumed that a sequential page fetch has cost 1.0. This patch doesn't in itself change the system's behavior at all, but it opens the door to people adopting other units of measurement for EXPLAIN costs. Also, if we ever decide it's worth inventing per-tablespace access cost settings, this change provides a workable intellectual framework for that.
2006-05-18Fix choose_bitmap_and() so that partial index predicates are considered whenTom Lane
deciding whether a potential additional indexscan is redundant or not. As now coded, any use of a partial index that was already used in a previous AND arm will be rejected as redundant. This might be overly restrictive, but not considering the point at all is definitely bad, as per example in bug #2441 from Arjen van der Meijden. In particular, a clauseless scan of a partial index was *never* considered redundant by the previous coding, and that's surely wrong. Being more flexible would also require some consideration of how not to double-count the index predicate's selectivity.
2006-05-18When a bitmap indexscan is using a partial index, it is necessary to includeTom Lane
the partial index predicate in the scan's "recheck condition". Otherwise, if the scan becomes lossy for lack of bitmap memory, we would fail to enforce that returned rows satisfy the predicate. Noted while studying bug #2441 from Arjen van der Meijden.
2006-05-18Fix thinko in recent changes to handle ScalarArrayOpExpr as an indexableTom Lane
condition: when there are multiple possible index paths involving ScalarArrayOpExprs, they are logically to be ANDed together not ORed. This thinko was a direct consequence of trying to put the processing inside generate_bitmap_or_paths(), which I now see was a bit too cute. So pull it out and make the callers do it separately (there are only two that need it anyway). Partially responds to bug #2441 from Arjen van der Meijden. There are some additional infelicities exposed by his example, but they are also in 8.1.x, while this mistake is not.
2006-05-03Fix calculation of plan node extParams to account for the possibility that oneTom Lane
initPlan sets a parameter for another. This could not (I think) happen before 8.1, but it's possible now because the initPlans generated by MIN/MAX optimization might themselves use initPlans. We attach those initPlans as siblings of the MIN/MAX ones, not children, to avoid duplicate computation when multiple MIN/MAX aggregates are present; so this leads to the case of an initPlan needing the result of a sibling initPlan, which is not possible with ordinary query nesting. Hadn't been noticed because in most contexts having too much stuff listed in extParam is fairly harmless. Fixes "plan should not reference subplan's variable" bug reported by Catalin Pitis.
2006-05-02Avoid assuming that statistics for a parent relation reflect the properties ofTom Lane
the union of its child relations as well. This might have been a good idea when it was originally coded, but it's a fatally bad idea when inheritance is being used for partitioning. It's better to have no stats at all than completely misleading stats. Per report from Mark Liberman. The bug arguably exists all the way back, but I've only patched HEAD and 8.1 because we weren't particularly trying to support partitioning before 8.1. Eventually we ought to look at deriving union statistics instead of just punting, but for now the drop kick looks good.
2006-04-30Improve the representation of FOR UPDATE/FOR SHARE so that we canTom Lane
support both FOR UPDATE and FOR SHARE in one command, as well as both NOWAIT and normal WAIT behavior. The more general code is actually simpler and cleaner.
2006-04-28Remove the restriction originally coded into optimize_minmax_aggregates() thatTom Lane
MIN/MAX not be converted to use an index if the query WHERE clause contains any volatile functions or subplans. I had originally feared that the conversion might alter the behavior of such a query with respect to a volatile function. Well, so it might, but only in the sense that the function would get evaluated at a subset of the table rows rather than all of them --- and we have never made any such guarantee anyway. (For instance, we don't refuse to use an index for an ordinary non-aggregate query when one of the non-indexable filter conditions contains a volatile function.) The prohibition against subplans was because of worry that that case wasn't adequately tested, which it wasn't, but it turns out to be possible to make 8.1 fail anyway: regression=# select o.ten, (select max(unique2) from tenk1 i where ten = o.ten or ten = (select f1 from int4_tbl limit 1)) from tenk1 o; ERROR: direct correlated subquery unsupported as initplan This is due to bogus code in SS_make_initplan_from_plan (it's an initplan, ergo it can't have any parParams). Having fixed that, we might as well allow subplans as well as initplans.
2006-04-25The 8.1 planner removes WHERE quals from the plan when the quals areTom Lane
implied by the predicate of a partial index being used to scan a table. However, this optimization is unsafe in an UPDATE, DELETE, or SELECT FOR UPDATE query, because the quals need to be rechecked by EvalPlanQual if there's an update conflict. Per example from Jean-Samuel Reynaud.
2006-04-22Simplify ParamListInfo data structure to support only numbered parameters,Tom Lane
not named ones, and replace linear searches of the list with array indexing. The named-parameter support has been dead code for many years anyway, and recent profiling suggests that the searching was costing a noticeable amount of performance for complex queries.
2006-04-09Revert my best_inner_indexscan patch of yesterday, which turns out to haveTom Lane
had a bad side-effect: it stopped finding plans that involved BitmapAnd combinations of indexscans using both join and non-join conditions. Instead, make choose_bitmap_and more aggressive about detecting redundancies between BitmapOr subplans.
2006-04-08Fix best_inner_indexscan to actually enforce that an "inner indexscan" useTom Lane
at least one join condition as an indexqual. Before bitmap indexscans, this oversight didn't really cost much except for redundantly considering the same join paths twice; but as of 8.1 it could result in silly bitmap scans that would do the same BitmapOr twice and then BitmapAnd these together :-(
2006-04-07Fix make_restrictinfo_from_bitmapqual() to preserve AND/OR flatness of itsTom Lane
output, ie, no OR immediately below an OR. Otherwise we get Asserts or wrong answers for cases such as select * from tenk1 a, tenk1 b where (a.ten = b.ten and (a.unique1 = 100 or a.unique1 = 101)) or (a.hundred = b.hundred and a.unique1 = 42); Per report from Rafael Martinez Guerrero.
2006-04-05Fix a bunch of problems with domains by making them use special input functionsTom Lane
that apply the necessary domain constraint checks immediately. This fixes cases where domain constraints went unchecked for statement parameters, PL function local variables and results, etc. We can also eliminate existing special cases for domains in places that had gotten it right, eg COPY. Also, allow domains over domains (base of a domain is another domain type). This almost worked before, but was disallowed because the original patch hadn't gotten it quite right.