diff options
| author | Robert Haas <rhaas@postgresql.org> | 2025-12-05 11:05:12 -0500 |
|---|---|---|
| committer | Robert Haas <rhaas@postgresql.org> | 2025-12-05 12:00:18 -0500 |
| commit | 014f9a831a320666bf2195949f41710f970c54ad (patch) | |
| tree | 368a7654bbeae5a8ce9733a05782cdf42964f365 /src/backend/optimizer/plan/planner.c | |
| parent | 8f1791c61836d213acbf85d368c8762705ad9d51 (diff) | |
Don't reset the pathlist of partitioned joinrels.
apply_scanjoin_target_to_paths wants to avoid useless work and
platform-specific dependencies by throwing away the path list created
prior to applying the final scan/join target and constructing a whole
new one using the final scan/join target. However, this is only valid
when we'll consider all the same strategies after the pathlist reset
as before.
After resetting the path list, we reconsider Append and MergeAppend
paths with the modified target list; therefore, it's only valid for a
partitioned relation. However, what the previous coding missed is that
it cannot be a partitioned join relation, because that also has paths
that are not Append or MergeAppend paths and will not be reconsidered.
Thus, before this patch, we'd sometimes choose a partitionwise strategy
with a higher total cost than cheapest non-partitionwise strategy,
which is not good.
We had a surprising number of tests cases that were relying on this
bug to work as they did. A big part of the reason for this is that row
counts in regression test cases tend to be low, which brings the cost
of partitionwise and non-partitionwise strategies very close together,
especially for merge joins, where the real and perceived advantages of
a partitionwise approach are minimal. In addition, one test case
included a row-count-inflating join. In such cases, a partitionwise
join can easily be a loser on cost, because the total number of tuples
passing through an Append node is much higher than it is with a
non-partitionwise strategy. That test case is adjusted by adding
additional join clauses to avoid the row count inflation.
Although the failure of the planner to choose the lowest-cost path is a
bug, we generally do not back-patch fixes of this type, because planning
is not an exact science and there is always a possibility that some user
will end up with a plan that has a lower estimated cost but actually
runs more slowly. Hence, no backpatch here, either.
The code change here is exactly what was originally proposed by
Ashutosh, but the changes to the comments and test cases have been
very heavily rewritten by me, helped along by some very useful advice
from Richard Guo.
Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Author: Robert Haas <rhaas@postgresql.org>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Arne Roland <arne.roland@malkut.net>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: http://postgr.es/m/CAExHW5toze58+jL-454J3ty11sqJyU13Sz5rJPQZDmASwZgWiA@mail.gmail.com
Diffstat (limited to 'src/backend/optimizer/plan/planner.c')
| -rw-r--r-- | src/backend/optimizer/plan/planner.c | 32 |
1 files changed, 19 insertions, 13 deletions
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 0e78628bf01..fd77334e5fd 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -7906,17 +7906,23 @@ apply_scanjoin_target_to_paths(PlannerInfo *root, check_stack_depth(); /* - * If the rel is partitioned, we want to drop its existing paths and - * generate new ones. This function would still be correct if we kept the - * existing paths: we'd modify them to generate the correct target above - * the partitioning Append, and then they'd compete on cost with paths - * generating the target below the Append. However, in our current cost - * model the latter way is always the same or cheaper cost, so modifying - * the existing paths would just be useless work. Moreover, when the cost - * is the same, varying roundoff errors might sometimes allow an existing - * path to be picked, resulting in undesirable cross-platform plan - * variations. So we drop old paths and thereby force the work to be done - * below the Append, except in the case of a non-parallel-safe target. + * If the rel only has Append and MergeAppend paths, we want to drop its + * existing paths and generate new ones. This function would still be + * correct if we kept the existing paths: we'd modify them to generate the + * correct target above the partitioning Append, and then they'd compete + * on cost with paths generating the target below the Append. However, in + * our current cost model the latter way is always the same or cheaper + * cost, so modifying the existing paths would just be useless work. + * Moreover, when the cost is the same, varying roundoff errors might + * sometimes allow an existing path to be picked, resulting in undesirable + * cross-platform plan variations. So we drop old paths and thereby force + * the work to be done below the Append. + * + * However, there are several cases when this optimization is not safe. If + * the rel isn't partitioned, then none of the paths will be Append or + * MergeAppend paths, so we should definitely not do this. If it is + * parititoned but is a joinrel, it may have Append and MergeAppend paths, + * but it can also have join paths that we can't afford to discard. * * Some care is needed, because we have to allow * generate_useful_gather_paths to see the old partial paths in the next @@ -7924,7 +7930,7 @@ apply_scanjoin_target_to_paths(PlannerInfo *root, * generate_useful_gather_paths to add path(s) to the main list, and * finally zap the partial pathlist. */ - if (rel_is_partitioned) + if (rel_is_partitioned && IS_SIMPLE_REL(rel)) rel->pathlist = NIL; /* @@ -7950,7 +7956,7 @@ apply_scanjoin_target_to_paths(PlannerInfo *root, } /* Finish dropping old paths for a partitioned rel, per comment above */ - if (rel_is_partitioned) + if (rel_is_partitioned && IS_SIMPLE_REL(rel)) rel->partial_pathlist = NIL; /* Extract SRF-free scan/join target. */ |
