Age | Commit message (Collapse) | Author |
|
Per buildfarm and local experimentation, bleeding-edge gcc isn't
convinced that the MemSet in reorder_function_arguments() is safe.
Shut it up by adding an explicit check that pronargs isn't negative,
and by changing MemSet to memset. (It appears that either change is
enough to quiet the warning at -O2, but let's do both to be sure.)
|
|
array_get_element and array_get_slice qualify as leakproof, since
they will silently return NULL for bogus subscripts. But
array_set_element and array_set_slice throw errors for such cases,
making them clearly not leakproof. contain_leaked_vars was evidently
written with only the former case in mind, as it gave the wrong answer
for assignment SubscriptingRefs (nee ArrayRefs).
This would be a live security bug, were it not that assignment
SubscriptingRefs can only occur in INSERT and UPDATE target lists,
while we only care about leakproofness for qual expressions; so the
wrong answer can't occur in practice. Still, that's a rather shaky
answer for a security-related question; and maybe in future somebody
will want to ask about leakproofness of a tlist. So it seems wise to
fix and even back-patch this correction.
(We would need some change here anyway for the upcoming
generic-subscripting patch, since extensions might make different
tradeoffs about whether to throw errors. Commit 558d77f20 attempted
to lay groundwork for that by asking check_functions_in_node whether a
SubscriptingRef contains leaky functions; but that idea fails now that
the implementation methods of a SubscriptingRef are not SQL-visible
functions that could be marked leakproof or not.)
Back-patch to 9.6. While 9.5 has the same issue, the code's a bit
different. It seems quite unlikely that we'd introduce any actual bug
in the short time 9.5 has left to live, so the work/risk/reward balance
isn't attractive for changing 9.5.
Discussion: https://postgr.es/m/3143742.1607368115@sss.pgh.pa.us
|
|
If a PlaceHolderVar is to be evaluated at a join relation, but
its value is only needed there and not at higher levels, we neglected
to update the joinrel's direct_lateral_relids to include the PHV's
source rel. This causes problems because join_is_legal() then won't
allow joining the joinrel to the PHV's source rel at all, leading
to "failed to build any N-way joins" planner failures.
Per report from Andreas Seltenreich. Back-patch to 9.5
where the problem originated.
Discussion: https://postgr.es/m/87blfgqa4t.fsf@aurora.ydns.eu
|
|
Previously this code assumed that all IndexScan nodes supported
mark/restore, which is not true since it depends on optional index AM
support functions. This could lead to errors about missing support
functions in rare edge cases of mergejoins with no sort keys, where an
unordered non-btree index scan was placed on the inner path without a
protecting Materialize node. (Normally, the fact that merge join
requires ordered input would avoid this error.)
Backpatch all the way since this bug is ancient.
Per report from Eugen Konkov on irc.
Discussion: https://postgr.es/m/87o8jn50be.fsf@news-spur.riddles.org.uk
|
|
In the planner, it was possible, given an extreme enough case containing a
large number of joins for the number of estimated rows to become infinite.
This could cause problems in initial_cost_mergejoin() where we perform
some calculations based on those row estimates.
A problem case, presented by Onder Kalaci showed an Assert failure from
an Assert checking outerstartsel <= outerendsel. In his test case this
was effectively NaN <= Inf, which is false. The NaN outerstartsel came
from multiplying the infinite outer_path_rows by 0.0.
In master, this problem was fixed by a90c950fc, however, that fix was too
invasive for the backbranches. Here we just relax the Asserts to allow
them to pass. The worst that appears to happen from this is that we show
NaN cost values and infinite row estimates in EXPLAIN. add_path() would
have had a hard time doing anything useful with such costs, but that does
not really matter as if the row estimates were even close to accurate,
such plan would not complete this side of the heat death of the universe.
Reported-by: Onder Kalaci
Backpatch: 9.5 to 13
Discussion: https://postgr.es/m/DM6PR21MB1211FF360183BCA901B27F04D80B0@DM6PR21MB1211.namprd21.prod.outlook.com
|
|
get_eclass_for_sort_expr() computes expr_relids and nullable_relids
early on, even though they won't be needed unless we make a new
EquivalenceClass, which we often don't. Aside from the probably-minor
inefficiency, there's a memory management problem: these bitmapsets will
be built in the caller's context, leading to dangling pointers if that
is shorter-lived than root->planner_cxt. This would be a live bug if
get_eclass_for_sort_expr() could be called with create_it = true during
GEQO join planning. So far as I can find, the core code never does
that, but it's hard to be sure that no extensions do, especially since
the comments make it clear that that's supposed to be a supported case.
Fix by not computing these values until we've switched into planner_cxt
to build the new EquivalenceClass.
generate_join_implied_equalities() uses inner_rel->relids to look up
relevant eclasses, but it ought to be using nominal_inner_relids.
This is presently harmless because a child RelOptInfo will always have
exactly the same eclass_indexes as its topmost parent; but that might
not be true forever, and anyway it makes the code confusing.
The first of these is old (introduced by me in f3b3b8d5b), so back-patch
to all supported branches. The second only dates to v13, but we might
as well back-patch it to keep the code looking similar across branches.
Discussion: https://postgr.es/m/1508010.1601832581@sss.pgh.pa.us
|
|
The trouble with doing this is that an apparently-constant subquery
output column isn't really constant if it is a grouping column that
appears in only some of the grouping sets. A qual using such a
column would be subject to incorrect const-folding after push-down,
as seen in bug #16585 from Paul Sivash.
To fix, just disable qual pushdown altogether if the sub-query has
nonempty groupingSets. While we could imagine far less restrictive
solutions, there is not much point in working harder right now,
because subquery_planner() won't move HAVING clauses to WHERE within
such a subquery. If the qual stays in HAVING it's not going to be
a lot more useful than if we'd kept it at the outer level.
Having said that, this restriction could be removed if we used a
parsetree representation that distinguished such outputs from actual
constants, which is something I hope to do in future. Hence, make
the patch a minimal addition rather than integrating it more tightly
(e.g. by renumbering the existing items in subquery_is_pushdown_safe's
comment).
Back-patch to 9.5 where grouping sets were introduced.
Discussion: https://postgr.es/m/16585-9d8c340d23ade8c1@postgresql.org
|
|
nodeSubplan.c expects that the testexpr for a hashable ANY SubPlan
has the form of one or more OpExprs whose LHS is an expression of the
outer query's, while the RHS is an expression over Params representing
output columns of the subquery. However, the planner only went as far
as verifying that the clauses were all binary OpExprs. This works
99.99% of the time, because the clauses have the right shape when
emitted by the parser --- but it's possible for function inlining to
break that, as reported by PegoraroF10. To fix, teach the planner
to check that the LHS and RHS contain the right things, or more
accurately don't contain the wrong things. Given that this has been
broken for years without anyone noticing, it seems sufficient to just
give up hashing when it happens, rather than go to the trouble of
commuting the clauses back again (which wouldn't necessarily work
anyway).
While poking at that, I also noticed that nodeSubplan.c had a baked-in
assumption that the number of hash clauses is identical to the number
of subquery output columns. Again, that's fine as far as parser output
goes, but it's not hard to break it via function inlining. There seems
little reason for that assumption though --- AFAICS, the only thing
it's buying us is not having to store the number of hash clauses
explicitly. Adding code to the planner to reject such cases would take
more code than getting nodeSubplan.c to cope, so I fixed it that way.
This has been broken for as long as we've had hashable SubPlans,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/1549209182255-0.post@n3.nabble.com
|
|
The qual pushdown logic assumed that all Vars in a restriction clause
must be Vars referencing subquery outputs; but since we introduced
LATERAL, it's possible for such a Var to be a lateral reference instead.
This led to an assertion failure in debug builds. In a non-debug
build, there might be no ill effects (if qual_is_pushdown_safe decided
the qual was unsafe anyway), or we could get failures later due to
construction of an invalid plan. I've not gone to much length to
characterize the possible failures, but at least segfaults in the
executor have been observed.
Given that this has been busted since 9.3 and it took this long for
anybody to notice, I judge that the case isn't worth going to great
lengths to optimize. Hence, fix by just teaching qual_is_pushdown_safe
that such quals are unsafe to push down, matching the previous behavior
when it accidentally didn't fail.
Per report from Tom Ellis. Back-patch to all supported branches.
Discussion: https://postgr.es/m/20200713175124.GQ8220@cloudinit-builder
|
|
After running GetForeignRelSize for a foreign table, adjust rel->tuples
to be at least as large as rel->rows. This prevents bizarre behavior
in estimate_num_groups() and perhaps other places, especially in the
scenario where rel->tuples is zero because pg_class.reltuples is
(suggesting that ANALYZE has never been run for the table). As things
stood, we'd end up estimating one group out of any GROUP BY on such a
table, whereas the default group-count estimate is more likely to result
in a sane plan.
Also, clarify in the documentation that GetForeignRelSize has the option
to override the rel->tuples value if it has a better idea of what to use
than what is in pg_class.reltuples.
Per report from Jeff Janes. Back-patch to all supported branches.
Patch by me; thanks to Etsuro Fujita for review
Discussion: https://postgr.es/m/CAMkU=1xNo9cnan+Npxgz0eK7394xmjmKg-QEm8wYG9P5-CcaqQ@mail.gmail.com
|
|
Reported-by: Amit Langote
Author: Amit Langote
Backpatch-through: 9.6, where it was introduced
Discussion: https://postgr.es/m/CA+HiwqFNADeukaaGRmTqANbed9Fd81gLi08AWe_F86_942Gspw@mail.gmail.com
|
|
|
|
WindowAgg will potentially store large numbers of input rows into
tuplestores to allow access to other rows in the frame. If the input
is coming via an explicit Sort node, then unneeded columns will
already have been discarded (since Sort requests a small tlist); but
there are idioms like COUNT(*) OVER () that result in the input not
being sorted at all, and cases where the input is being sorted by some
means other than a Sort; if we don't request a small tlist, then
WindowAgg's storage requirement is inflated by the unneeded columns.
Backpatch back to 9.6, where the current tlist handling was added.
(Prior to that, WindowAgg would always use a small tlist.)
Discussion: https://postgr.es/m/87a7ator8n.fsf@news-spur.riddles.org.uk
|
|
d4c3a156c added code to remove columns that were not part of a table's
PRIMARY KEY constraint from the GROUP BY clause when all the primary key
columns were present in the group by. This is fine to do since we know
that there will only be one row per group coming from this relation.
However, the logic failed to consider inheritance parent relations. These
can have child relations without a primary key, but even if they did, they
could duplicate one of the parent's rows or one from another child
relation. In this case, those additional GROUP BY columns are required.
Fix this by disabling the optimization for inheritance parent tables.
In v11 and beyond, partitioned tables are fine since partitions cannot
overlap and before v11 partitioned tables could not have a primary key.
Reported-by: Manuel Rigger
Discussion: http://postgr.es/m/CA+u7OA7VLKf_vEr6kLF3MnWSA9LToJYncgpNX2tQ-oWzYCBQAw@mail.gmail.com
Backpatch-through: 9.6
|
|
The logic in reorder_grouping_sets to order grouping set elements to
match a pre-specified sort ordering was defective, resulting in
unnecessary sort nodes (though the query output would still be
correct). Repair, simplifying the code a little, and add a test.
Per report from Richard Guo, though I didn't use their patch. Original
bug seems to have been my fault.
Backpatch back to 9.5 where grouping sets were introduced.
Discussion: https://postgr.es/m/CAN_9JTzyjGcUjiBHxLsgqfk7PkdLGXiM=pwM+=ph2LsWw0WO1A@mail.gmail.com
|
|
create_merge_append_plan failed to honor the CP_EXACT_TLIST flag:
it would generate the expected targetlist but then it felt free to
add resjunk sort targets to it. This demonstrably leads to assertion
failures in v11 and HEAD, and it's probably just accidental that we
don't see the same in older branches. I've not looked into whether
there would be any real-world consequences in non-assert builds.
In HEAD, create_append_plan has sprouted the same problem, so fix
that too (although we do not have any test cases that seem able to
reach that bug). This is an oversight in commit 3fc6e2d7f which
invented the CP_EXACT_TLIST flag, so back-patch to 9.6 where that
came in.
convert_subquery_pathkeys would create pathkeys for subquery output
values if they match any EquivalenceClass known in the outer query
and are available in the subquery's syntactic targetlist. However,
the second part of that condition is wrong, because such values might
not appear in the subquery relation's reltarget list, which would
mean that they couldn't be accessed above the level of the subquery
scan. We must check that they appear in the reltarget list, instead.
This can lead to dropping knowledge about the subquery's sort
ordering, but I believe it's okay, because any sort key that the
outer query actually has any interest in would appear in the
reltarget list.
This second issue is of very long standing, but right now there's no
evidence that it causes observable problems before 9.6, so I refrained
from back-patching further than that. We can revisit that choice if
somebody finds a way to make it cause problems in older branches.
(Developing useful test cases for these issues is really problematic;
fixing convert_subquery_pathkeys removes the only known way to exhibit
the create_merge_append_plan bug, and neither of the test cases added
by this patch causes a problem in all branches, even when considering
the issues separately.)
The second issue explains bug #15795 from Suresh Kumar R ("could not
find pathkey item to sort" with nested DISTINCT queries). I stumbled
across the first issue while investigating that.
Discussion: https://postgr.es/m/15795-fadb56c8e44ee73c@postgresql.org
|
|
join_is_legal() needs to reject forming certain outer joins in cases
where that would lead the planner down a blind alley. However, it
mistakenly supposed that the way to handle full joins was to treat them
as applying the same constraints as for left joins, only to both sides.
That doesn't work, as shown in bug #15741 from Anthony Skorski: given
a lateral reference out of a join that's fully enclosed by a full join,
the code would fail to believe that any join ordering is legal, resulting
in errors like "failed to build any N-way joins".
However, we don't really need to consider full joins at all for this
purpose, because we effectively force them to be evaluated in syntactic
order, and that order is always legal for lateral references. Hence,
get rid of this broken logic for full joins and just ignore them instead.
This seems to have been an oversight in commit 7e19db0c0.
Back-patch to all supported branches, as that was.
Discussion: https://postgr.es/m/15741-276f1f464b3f40eb@postgresql.org
|
|
The assertions added by commits 34ea1ab7f et al found another problem:
set_dummy_rel_pathlist and mark_dummy_rel were failing to label
the dummy paths they create with the correct outer_relids, in case
the relation is necessarily parameterized due to having lateral
references in its tlist. It's likely that this has no user-visible
consequences in production builds, at the moment; but still an assertion
failure is a bad thing, so back-patch the fix.
Per bug #15694 from Roman Zharkov (via Alexander Lakhin)
and an independent report by Tushar Ahuja.
Discussion: https://postgr.es/m/15694-74f2ca97e7044f7f@postgresql.org
Discussion: https://postgr.es/m/7d72ab20-c725-3ce2-f99d-4e64dd8a0de6@enterprisedb.com
|
|
In the case where inheritance_planner() finds that every table has
been excluded by constraints, it thought it could get away with
making a plan consisting of just a dummy Result node. While certainly
there's no updating or deleting to be done, this had two user-visible
problems: the plan did not report the correct set of output columns
when a RETURNING clause was present, and if there were any
statement-level triggers that should be fired, it didn't fire them.
Hence, rather than only generating the dummy Result, we need to
stick a valid ModifyTable node on top, which requires a tad more
effort here.
It's been broken this way for as long as inheritance_planner() has
known about deleting excluded subplans at all (cf commit 635d42e9c),
so back-patch to all supported branches.
Amit Langote and Tom Lane, per a report from Petr Fedorov.
Discussion: https://postgr.es/m/5da6f0f0-1364-1876-6978-907678f89a3e@phystech.edu
|
|
Check ec_relids before bothering to iterate through the EC members.
On a perhaps extreme, but still real-world, query in which
match_eclasses_to_foreign_key_col() accounts for the bulk of the
planner's runtime, this saves nearly 40% of the runtime. It's a bit
of a stopgap fix, but it's simple enough to be back-patched to 9.6
where this code came in; so let's do that.
David Rowley
Discussion: https://postgr.es/m/6970.1545327857@sss.pgh.pa.us
|
|
The previous ordering of these steps satisfied the nominal requirement
that set_rel_pathlist_hook could editorialize on the whole set of Paths
constructed for a base relation. In practice, though, trying to change
the set of partial paths was impossible. Adding one didn't work because
(a) it was too late to be included in Gather paths made by the core code,
and (b) calling add_partial_path after generate_gather_paths is unsafe,
because it might try to delete a path it thinks is dominated, but that
is already embedded in some Gather path(s). Nor could the hook safely
remove partial paths, for the same reason that they might already be
embedded in Gathers.
Better to call extensions first, let them add partial paths as desired,
and then gather. In v11 and up, we already doubled down on that ordering
by postponing gathering even further for single-relation queries; so even
if the hook wished to editorialize on Gather path construction, it could
not.
Report and patch by KaiGai Kohei. Back-patch to 9.6 where Gather paths
were added.
Discussion: https://postgr.es/m/CAOP8fzahwpKJRTVVTqo2AE=mDTz_efVzV6Get_0=U3SO+-ha1A@mail.gmail.com
|
|
As reported in bug #15613 from Srinivasan S A, file_fdw and postgres_fdw
neglected to mark plain baserel foreign paths as parameterized when the
relation has lateral_relids. Other FDWs have surely copied this mistake,
so rather than just patching those two modules, install a band-aid fix
in create_foreignscan_path to rectify the mistake centrally.
Although the band-aid is enough to fix the visible symptom, correct
the calls in file_fdw and postgres_fdw anyway, so that they are valid
examples for external FDWs.
Also, since the band-aid isn't enough to make this work for parameterized
foreign joins, throw an elog(ERROR) if such a case is passed to
create_foreignscan_path. This shouldn't pose much of a problem for
existing external FDWs, since it's likely they aren't trying to make such
paths anyway (though some of them may need a defense against joins with
lateral_relids, similar to the one this patch installs into postgres_fdw).
Add some assertions in relnode.c to catch future occurrences of the same
error --- in particular, as backstop against core-code mistakes like the
one fixed by commit bdd9a99aa.
Discussion: https://postgr.es/m/15613-092be1be9576c728@postgresql.org
|
|
create_lateral_join_info() computes a bunch of information about lateral
references between base relations, and then attempts to propagate those
markings to appendrel children of the original base relations. But the
original coding neglected the possibility of indirect descendants
(grandchildren etc). During v11 development we noticed that this was
wrong for partitioned-table cases, but failed to realize that it was just
as wrong for any appendrel. While the case can't arise for appendrels
derived from traditional table inheritance (because we make a flat
appendrel for that), nested appendrels can arise from nested UNION ALL
subqueries. Failure to mark the lower-level relations as having lateral
references leads to confusion in add_paths_to_append_rel about whether
unparameterized paths can be built. It's not very clear whether that
leads to any user-visible misbehavior; the lack of field reports suggests
that it may cause nothing worse than minor cost misestimation. Still,
it's a bug, and it leads to failures of Asserts that I intend to add
later.
To fix, we need to propagate information from all appendrel parents,
not just those that are RELOPT_BASERELs. We can still do it in one
pass, if we rely on the append_rel_list to be ordered with ancestor
relationships before descendant ones; add assertions checking that.
While fixing this, we can make a small performance improvement by
traversing the append_rel_list just once instead of separately for
each appendrel parent relation.
Noted while investigating bug #15613, though this patch does not fix
that (which is why I'm not committing the related Asserts yet).
Discussion: https://postgr.es/m/3951.1549403812@sss.pgh.pa.us
|
|
Up to now, createplan.c attempted to share PARAM_EXEC slots for
NestLoopParams across different plan levels, if the same underlying Var
was being fed down to different righthand-side subplan trees by different
NestLoops. This was, I think, more of an artifact of using subselect.c's
PlannerParamItem infrastructure than an explicit design goal, but anyway
that was the end result.
This works well enough as long as the plan tree is executing synchronously,
but the feature whereby Gather can execute the parallelized subplan locally
breaks it. An upper NestLoop node might execute for a row retrieved from
a parallel worker, and assign a value for a PARAM_EXEC slot from that row,
while the leader's copy of the parallelized subplan is suspended with a
different active value of the row the Var comes from. When control
eventually returns to the leader's subplan, it gets the wrong answers if
the same PARAM_EXEC slot is being used within the subplan, as reported
in bug #15577 from Bartosz Polnik.
This is pretty reminiscent of the problem fixed in commit 46c508fbc, and
the proper fix seems to be the same: don't try to share PARAM_EXEC slots
across different levels of controlling NestLoop nodes.
This requires decoupling NestLoopParam handling from PlannerParamItem
handling, although the logic remains somewhat similar. To avoid bizarre
division of labor between subselect.c and createplan.c, I decided to move
all the param-slot-assignment logic for both cases out of those files
and put it into a new file paramassign.c. Hopefully it's a bit better
documented now, too.
A regression test case for this might be nice, but we don't know a
test case that triggers the problem with a suitably small amount
of data.
Back-patch to 9.6 where we added Gather nodes. It's conceivable that
related problems exist in older branches; but without some evidence
for that, I'll leave the older branches alone.
Discussion: https://postgr.es/m/15577-ca61ab18904af852@postgresql.org
|
|
MinMaxExpr invokes the btree comparison function for its input datatype,
so it's only leakproof if that function is. Many such functions are
indeed leakproof, but others are not, and we should not just assume that
they are. Hence, adjust contain_leaked_vars to verify the leakproofness
of the referenced function explicitly.
I didn't add a regression test because it would need to depend on
some particular comparison function being leaky, and that's a moving
target, per discussion.
This has been wrong all along, so back-patch to supported branches.
Discussion: https://postgr.es/m/31042.1546194242@sss.pgh.pa.us
|
|
"rescanratio" was computed as 1 + rescanned-tuples / total-inner-tuples,
which is sensible if it's to be multiplied by total-inner-tuples or a cost
value corresponding to scanning all the inner tuples. But in reality it
was (mostly) multiplied by inner_rows or a related cost, numbers that take
into account the possibility of stopping short of scanning the whole inner
relation thanks to a limited key range in the outer relation. This'd
still make sense if we could expect that stopping short would result in a
proportional decrease in the number of tuples that have to be rescanned.
It does not, however. The argument that establishes the validity of our
estimate for that number is independent of whether we scan all of the inner
relation or stop short, and experimentation also shows that stopping short
doesn't reduce the number of rescanned tuples. So the correct calculation
is 1 + rescanned-tuples / inner_rows, and we should be sure to multiply
that by inner_rows or a corresponding cost value.
Most of the time this doesn't make much difference, but if we have
both a high rescan rate (due to lots of duplicate values) and an outer
key range much smaller than the inner key range, then the error can
be significant, leading to a large underestimate of the cost associated
with rescanning.
Per report from Vijaykumar Jain. This thinko appears to go all the way
back to the introduction of the rescan estimation logic in commit
70fba7043, so back-patch to all supported branches.
Discussion: https://postgr.es/m/CAE7uO5hMb_TZYJcZmLAgO6iD68AkEK6qCe7i=vZUkCpoKns+EQ@mail.gmail.com
|
|
postgres_fdw's postgresGetForeignPlan() assumes without checking that the
outer_plan it's given for a join relation must have a NestLoop, MergeJoin,
or HashJoin node at the top. That's been wrong at least since commit
4bbf6edfb (which could cause insertion of a Sort node on top) and it seems
like a pretty unsafe thing to Just Assume even without that.
Through blind good fortune, this doesn't seem to have any worse
consequences today than strange EXPLAIN output, but it's clearly trouble
waiting to happen.
To fix, test the node type explicitly before touching Join-specific
fields, and avoid jamming the new tlist into a node type that can't
do projection. Export a new support function from createplan.c
to avoid building low-level knowledge about the latter into FDWs.
Back-patch to 9.6 where the faulty coding was added. Note that the
associated regression test cases don't show any changes before v11,
apparently because the tests back-patched with 4bbf6edfb don't actually
exercise the problem case before then (there's no top-level Sort
in those plans).
Discussion: https://postgr.es/m/8946.1544644803@sss.pgh.pa.us
|
|
Our support for multiple-set-clauses in UPDATE assumes that the Params
referencing a MULTIEXPR_SUBLINK SubPlan will appear before that SubPlan
in the targetlist of the plan node that calculates the updated row.
(Yeah, it's a hack...) In some PG branches it's possible that a Result
node gets inserted between the primary calculation of the update tlist
and the ModifyTable node. setrefs.c did the wrong thing in this case
and left the upper-level Params as Params, causing a crash at runtime.
What it should do is replace them with "outer" Vars referencing the child
plan node's output. That's a result of careless ordering of operations
in fix_upper_expr_mutator, so we can fix it just by reordering the code.
Fix fix_join_expr_mutator similarly for consistency, even though join
nodes could never appear in such a context. (In general, it seems
likely to be a bit cheaper to use Vars than Params in such situations
anyway, so this patch might offer a tiny performance improvement.)
The hazard extends back to 9.5 where the MULTIEXPR_SUBLINK stuff
was introduced, so back-patch that far. However, this may be a live
bug only in 9.6.x and 10.x, as the other branches don't seem to want
to calculate the final tlist below the Result node. (That plan shape
change between branches might be a mini-bug in itself, but I'm not
really interested in digging into the reasons for that right now.
Still, add a regression test memorializing what we expect there,
so we'll notice if it changes again.)
Per bug report from Eduards Bezverhijs.
Discussion: https://postgr.es/m/b6cd572a-3e44-8785-75e9-c512a5a17a73@tieto.com
|
|
classify_index_clause_usage() is O(N^2) in the number of distinct index
qual clauses it considers, because of its use of a simple search list to
store them. For nearly all queries, that's fine because only a few clauses
will be considered. But Alexander Kuzmenkov reported a machine-generated
query with 80000 (!) index qual clauses, which caused this code to take
forever. Somewhat remarkably, this is the only O(N^2) behavior we now
have for such a query, so let's fix it.
We can get rid of the O(N^2) runtime for cases like this without much
damage to the functionality of choose_bitmap_and() by separating out
paths with "too many" qual or pred clauses, and deeming them to always
be nonredundant with other paths. Then their clauses needn't go into
the search list, so it doesn't get too long, but we don't lose the
ability to consider bitmap AND plans altogether. I set the threshold
for "too many" to be 100 clauses per path, which should be plenty to
ensure no change in planning behavior for normal queries.
There are other things we could do to make this go faster, but it's not
clear that it's worth any additional effort. 80000 qual clauses require
a whole lot of work in many other places, too.
The code's been like this for a long time, so back-patch to all supported
branches. The troublesome query only works back to 9.5 (in 9.4 it fails
with stack overflow in the parser); so I'm not sure that fixing this in
9.4 has any real-world benefit, but perhaps it does.
Discussion: https://postgr.es/m/90c5bdfa-d633-dabe-9889-3cf3e1acd443@postgrespro.ru
|
|
Allowing sub-select containing LIMIT/OFFSET in workers can lead to
inconsistent results at the top-level as there is no guarantee that the
row order will be fully deterministic. The fix is to prohibit pushing
LIMIT/OFFSET within sub-selects to workers.
Reported-by: Andrew Fletcher
Bug: 15324
Author: Amit Kapila
Reviewed-by: Dilip Kumar
Backpatch-through: 9.6
Discussion: https://postgr.es/m/153417684333.10284.11356259990921828616@wrigleys.postgresql.org
|
|
Back-patch commit 655393a022bd653e2b48dbf20b69236981e35195 to 9.6. This
synchronizes the relavant code and helps in generating parallel paths in
some cases in 9.6. This also helps in back-patch of future patches where
we can get the same plan in all branches.
Discussion: https://postgr.es/m/CAA4eK1+XK_875cJA1HPVpx9C7C8Fp7i4QzLJ17T3igfU2iadxQ@mail.gmail.com
|
|
workers.
Allowing window function calculation in workers leads to inconsistent
results because if the input row ordering is not fully deterministic, the
output of window functions might vary across workers. The fix is to treat
them as parallel-restricted.
In the passing, improve the coding pattern in max_parallel_hazard_walker
so that it has a chain of mutually-exclusive if ... else if ... else if
... else if ... IsA tests.
Reported-by: Marko Tiikkaja
Bug: 15324
Author: Amit Kapila
Reviewed-by: Tom Lane
Backpatch-through: 9.6
Discussion: https://postgr.es/m/CAL9smLAnfPJCDUUG4ckX2iznj53V7VSMsYefzZieN93YxTNOcw@mail.gmail.com
|
|
GatherMergePath (introduced in 10) and CustomPath (introduced in 9.5)
have gone missing. The order of the Path nodes was inconsistent with
what is listed in nodes.h, so make the order consistent at the same time
to ease future checks and additions.
Author: Sawada Masahiko
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAD21AoBQMLoc=ohH-oocuAPsELrmk8_EsRJjOyR8FQLZkbE0wA@mail.gmail.com
|
|
We should only run apply_pathtarget_labeling_to_tlist if CP_LABEL_TLIST
was specified, because only in that case has use_physical_tlist checked
that the labeling will succeed; otherwise we may get an "ORDER/GROUP BY
expression not found in targetlist" error. (This subsumes the previous
test about gating_clauses, because we reset "flags" to zero earlier
if there are gating clauses to apply.)
The only known case in which a failure can occur is with a ProjectSet
path directly atop a table scan path, although it seems likely that there
are other cases or will be such in future. This means that the failure
is currently only visible in the v10 branch: 9.6 didn't have ProjectSet,
while in v11 and HEAD, apply_scanjoin_target_to_paths for some weird
reason is using create_projection_path not apply_projection_to_path,
masking the problem because there's a ProjectionPath in between.
Nonetheless this code is clearly wrong on its own terms, so back-patch
to 9.6 where this logic was introduced.
Per report from Regina Obe.
Discussion: https://postgr.es/m/001501d40f88$75186950$5f493bf0$@pcorp.us
|
|
canonicalize_ec_expression() is supposed to agree with coerce_type() as to
whether a RelabelType should be inserted to make a subexpression be valid
input for the operators of a given opclass. However, it did the wrong
thing with named-composite-type inputs to record_eq(): it put in a
RelabelType to RECORDOID, which the parser doesn't. In some cases this was
harmless because all code paths involving a particular equivalence class
did the same thing, but in other cases this would result in failing to
recognize a composite-type expression as being a member of an equivalence
class that it actually is a member of. The most obvious bad effect was to
fail to recognize that an index on a composite column could provide the
sort order needed for a mergejoin on that column, as reported by Teodor
Sigaev. I think there might be other, subtler, cases that result in
misoptimization. It also seems possible that an unwanted RelabelType
would sometimes get into an emitted plan --- but because record_eq and
friends don't examine the declared type of their input expressions, that
would not create any visible problems.
To fix, just treat RECORDOID as if it were a polymorphic type, which in
some sense it is. We might want to consider formalizing that a bit more
someday, but for the moment this seems to be the only place where an
IsPolymorphicType() test ought to include RECORDOID as well.
This has been broken for a long time, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/a6b22369-e3bf-4d49-f59d-0c41d3551e81@sigaev.ru
|
|
On further reflection, commit e5d83995e didn't go far enough: pretty much
everywhere in the planner that examines a clause's is_pushed_down flag
ought to be changed to use the more complicated behavior where we also
check the clause's required_relids. Otherwise we could make incorrect
decisions about whether, say, a clause is safe to use as a hash clause.
Some (many?) of these places are safe as-is, either because they are
never reached while considering a parameterized path, or because there
are additional checks that would reject a pushed-down clause anyway.
However, it seems smarter to just code them all the same way rather
than rely on easily-broken reasoning of that sort.
In support of that, invent a new macro RINFO_IS_PUSHED_DOWN that should
be used in place of direct tests on the is_pushed_down flag.
Like the previous patch, back-patch to all supported branches.
Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se
|
|
In some cases a clause attached to an outer join can be pushed down into
the outer join's RHS even though the clause is not degenerate --- this
can happen if we choose to make a parameterized path for the RHS. If
the clause ends up attached to a lower outer join, we'd misclassify it
as being a "join filter" not a plain "filter" condition at that node,
leading to wrong query results.
To fix, teach extract_actual_join_clauses to examine each join clause's
required_relids, not just its is_pushed_down flag. (The latter now
seems vestigial, or at least in need of rethinking, but we won't do
anything so invasive as redefining it in a bug-fix patch.)
This has been wrong since we introduced parameterized paths in 9.2,
though it's evidently hard to hit given the lack of previous reports.
The test case used here involves a lateral function call, and I think
that a lateral reference may be required to get the planner to select
a broken plan; though I wouldn't swear to that. In any case, even if
LATERAL is needed to trigger the bug, it still affects all supported
branches, so back-patch to all.
Per report from Andreas Karlsson. Thanks to Andrew Gierth for
preliminary investigation.
Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se
|
|
One of the things canonicalize_qual() does is to remove constant-NULL
subexpressions of top-level AND/OR clauses. It does that on the assumption
that what it's given is a top-level WHERE clause, so that NULL can be
treated like FALSE. Although this is documented down inside a subroutine
of canonicalize_qual(), it wasn't mentioned in the documentation of that
function itself, and some callers hadn't gotten that memo.
Notably, commit d007a9505 caused get_relation_constraints() to apply
canonicalize_qual() to CHECK constraints. That allowed constraint
exclusion to misoptimize situations in which a CHECK constraint had a
provably-NULL subclause, as seen in the regression test case added here,
in which a child table that should be scanned is not. (Although this
thinko is ancient, the test case doesn't fail before 9.2, for reasons
I've not bothered to track down in detail. There may be related cases
that do fail before that.)
More recently, commit f0e44751d added an independent bug by applying
canonicalize_qual() to index expressions, which is even sillier since
those might not even be boolean. If they are, though, I think this
could lead to making incorrect index entries for affected index
expressions in v10. I haven't attempted to prove that though.
To fix, add an "is_check" parameter to canonicalize_qual() to specify
whether it should assume WHERE or CHECK semantics, and make it perform
NULL-elimination accordingly. Adjust the callers to apply the right
semantics, or remove the call entirely in cases where it's not known
that the expression has one or the other semantics. I also removed
the call in some cases involving partition expressions, where it should
be a no-op because such expressions should be canonical already ...
and was a no-op, independently of whether it could in principle have
done something, because it was being handed the qual in implicit-AND
format which isn't what it expects. In HEAD, add an Assert to catch
that type of mistake in future.
This represents an API break for external callers of canonicalize_qual().
While that's intentional in HEAD to make such callers think about which
case applies to them, it seems like something we probably wouldn't be
thanked for in released branches. Hence, in released branches, the
extra parameter is added to a new function canonicalize_qual_ext(),
and canonicalize_qual() is a wrapper that retains its old behavior.
Patch by me with suggestions from Dean Rasheed. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/24475.1520635069@sss.pgh.pa.us
|
|
Since 9.5, it's possible that some but not all columns of an index
support returning the indexed value for index-only scans. If the
same indexed column appears in index columns that behave both ways,
check_index_only() supposed that it'd be OK to do an index-only scan
testing that column; but that fails if we have to recheck the indexed
condition on one of the columns that doesn't support this.
In principle we could make this work by remapping the recheck expressions
to pull the value from a column that does support returning the indexed
value. But such cases are so weird and rare that, at least for now,
it doesn't seem worth the trouble. Instead, just teach check_index_only
that a value is returnable only if all the index columns containing it
are returnable, rather than any of them.
Per report from David Pereiro Lagares. Back-patch to 9.5 where the
possibility of this situation appeared.
Kyotaro Horiguchi
Discussion: https://postgr.es/m/1516210494.1798.16.camel@nlpgo.com
|
|
Given overlapping or partially redundant join clauses, for example
t1 JOIN t2 ON t1.a = t2.x AND t1.b = t2.x
the planner's EquivalenceClass machinery will ordinarily refactor the
clauses as "t1.a = t1.b AND t1.a = t2.x", so that join processing doesn't
see multiple references to the same EquivalenceClass in a list of join
equality clauses. However, if the join is outer, it's incorrect to derive
a restriction clause on the outer side from the join conditions, so the
clause refactoring does not happen and we end up with overlapping join
conditions. The code that attempted to deal with such cases had several
subtle bugs, which could result in "left and right pathkeys do not match in
mergejoin" or "outer pathkeys do not match mergeclauses" planner errors,
if the selected join plan type was a mergejoin. (It does not appear that
any actually incorrect plan could have been emitted.)
The core of the problem really was failure to recognize that the outer and
inner relations' pathkeys have different relationships to the mergeclause
list. A join's mergeclause list is constructed by reference to the outer
pathkeys, so it will always be ordered the same as the outer pathkeys, but
this cannot be presumed true for the inner pathkeys. If the inner sides of
the mergeclauses contain multiple references to the same EquivalenceClass
({t2.x} in the above example) then a simplistic rendering of the required
inner sort order is like "ORDER BY t2.x, t2.x", but the pathkey machinery
recognizes that the second sort column is redundant and throws it away.
The mergejoin planning code failed to account for that behavior properly.
One error was to try to generate cut-down versions of the mergeclause list
from cut-down versions of the inner pathkeys in the same way as the initial
construction of the mergeclause list from the outer pathkeys was done; this
could lead to choosing a mergeclause list that fails to match the outer
pathkeys. The other problem was that the pathkey cross-checking code in
create_mergejoin_plan treated the inner and outer pathkey lists
identically, whereas actually the expectations for them must be different.
That led to false "pathkeys do not match" failures in some cases, and in
principle could have led to failure to detect bogus plans in other cases,
though there is no indication that such bogus plans could be generated.
Reported by Alexander Kuzmenkov, who also reviewed this patch. This has
been broken for years (back to around 8.3 according to my testing), so
back-patch to all supported branches.
Discussion: https://postgr.es/m/5dad9160-4632-0e47-e120-8e2082000c01@postgrespro.ru
|
|
create_plan_recurse lacked any stack depth check. This is not per
our normal coding rules, but I'd supposed it was safe because earlier
planner processing is more complex and presumably should eat more
stack. But bug #15033 from Andrew Grossman shows this isn't true,
at least not for queries having the form of a many-thousand-way
INTERSECT stack.
Further testing showed that recurse_set_operations is also capable
of being crashed in this way, since it likewise will recurse to the
bottom of a parsetree before calling any support functions that
might themselves contain any stack checks. However, its stack
consumption is only perhaps a third of create_plan_recurse's.
It's possible that this particular problem with create_plan_recurse can
only manifest in 9.6 and later, since before that we didn't build a Path
tree for set operations. But having seen this example, I now have no
faith in the proposition that create_plan_recurse doesn't need a stack
check, so back-patch to all supported branches.
Discussion: https://postgr.es/m/20180127050845.28812.58244@wrigleys.postgresql.org
|
|
If we're inside a lateral subquery, there may be no unparameterized paths
for a particular child relation of an appendrel, in which case we *must*
be able to create similarly-parameterized paths for each other child
relation, else the planner will fail with "could not devise a query plan
for the given query". This means that there are situations where we'd
better be able to reparameterize at least one path for each child.
This calls into question the assumption in reparameterize_path() that
it can just punt if it feels like it. However, the only case that is
known broken right now is where the child is itself an appendrel so that
all its paths are AppendPaths. (I think possibly I disregarded that in
the original coding on the theory that nested appendrels would get folded
together --- but that only happens *after* reparameterize_path(), so it's
not excused from handling a child AppendPath.) Given that this code's been
like this since 9.3 when LATERAL was introduced, it seems likely we'd have
heard of other cases by now if there were a larger problem.
Per report from Elvis Pranskevichus. Back-patch to 9.3.
Discussion: https://postgr.es/m/5981018.zdth1YWmNy@hammer.magicstack.net
|
|
If a query against an inheritance tree runs concurrently with an ALTER
TABLE that's disinheriting one of the tree members, it's possible to get
a "could not find inherited attribute" error because after obtaining lock
on the removed member, make_inh_translation_list sees that its columns
have attinhcount=0 and decides they aren't the columns it's looking for.
An ideal fix, perhaps, would avoid including such a just-removed member
table in the query at all; but there seems no way to accomplish that
without adding expensive catalog rechecks or creating a likelihood of
deadlocks. Instead, let's just drop the check on attinhcount. In this
way, a query that's included a just-disinherited child will still
succeed, which is not a completely unreasonable behavior.
This problem has existed for a long time, so back-patch to all supported
branches. Also add an isolation test verifying related behaviors.
Patch by me; the new isolation test is based on Kyotaro Horiguchi's work.
Discussion: https://postgr.es/m/20170626.174612.23936762.horiguchi.kyotaro@lab.ntt.co.jp
|
|
If we flatten a subquery whose target list contains constants or
expressions, when those output columns are used in GROUPING SET columns,
the planner was capable of doing the wrong thing by merging a pulled-up
expression into the surrounding expression during const-simplification.
Then the late processing that attempts to match subexpressions to grouping
sets would fail to match those subexpressions to grouping sets, with the
effect that they'd not go to null when expected.
To fix, wrap such subquery outputs in PlaceHolderVars, ensuring that
they preserve their separate identity throughout the planner's expression
processing. This is a bit of a band-aid, because the wrapper defeats
const-simplification even in places where it would be safe to allow.
But a nicer fix would likely be too invasive to back-patch, and the
consequences of the missed optimizations probably aren't large in most
cases.
Back-patch to 9.5 where grouping sets were introduced.
Heikki Linnakangas, with small mods and better test cases by me;
additional review by Andrew Gierth
Discussion: https://postgr.es/m/7dbdcf5c-b5a6-ef89-4958-da212fe10176@iki.fi
|
|
Since 9.4, we've allowed the syntax "select union select" and variants
of that. However, the planner wasn't expecting a no-column set operation
and ended up treating the set operation as if it were UNION ALL.
Pre-v10, there seem to be some executor issues that would need to be
fixed to support such cases, and it doesn't really seem worth expending
much effort on. Just disallow it, instead.
Per report from Victor Yegorov.
Discussion: https://postgr.es/m/CAGnEbojGJrRSOgJwNGM7JSJZpVAf8xXcVPbVrGdhbVEHZ-BUMw@mail.gmail.com
|
|
rewriteTargetListUD's processing is dependent on the relkind of the query's
target table. That was fine at the time it was made to act that way, even
for queries on inheritance trees, because all tables in an inheritance tree
would necessarily be plain tables. However, the 9.5 feature addition
allowing some members of an inheritance tree to be foreign tables broke the
assumption that rewriteTargetListUD's output tlist could be applied to all
child tables with nothing more than column-number mapping. This led to
visible failures if foreign child tables had row-level triggers, and would
also break in cases where child tables belonged to FDWs that used methods
other than CTID for row identification.
To fix, delay running rewriteTargetListUD until after the planner has
expanded inheritance, so that it is applied separately to the (already
mapped) tlist for each child table. We can conveniently call it from
preprocess_targetlist. Refactor associated code slightly to avoid the
need to heap_open the target relation multiple times during
preprocess_targetlist. (The APIs remain a bit ugly, particularly around
the point of which steps scribble on parse->targetList and which don't.
But avoiding such scribbling would require a change in FDW callback APIs,
which is more pain than it's worth.)
Also fix ExecModifyTable to ensure that "tupleid" is reset to NULL when
we transition from rows providing a CTID to rows that don't. (That's
really an independent bug, but it manifests in much the same cases.)
Add a regression test checking one manifestation of this problem, which
was that row-level triggers on a foreign child table did not work right.
Back-patch to 9.5 where the problem was introduced.
Etsuro Fujita, reviewed by Ildus Kurbangaliev and Ashutosh Bapat
Discussion: https://postgr.es/m/20170514150525.0346ba72@postgrespro.ru
|
|
Previously, we skipped using search_indexed_tlist_for_sortgroupref()
if the tlist expression being sought in the child plan node was merely
a Var. This is purely an optimization, based on the theory that
search_indexed_tlist_for_var() is faster, and one copy of a Var should
be as good as another. However, the GROUPING SETS patch broke the
latter assumption: grouping columns containing the "same" Var can
sometimes have different outputs, as shown in the test case added here.
So do it the hard way whenever a ressortgroupref marking exists.
(If this seems like a bottleneck, we could imagine building a tlist index
data structure for ressortgroupref values, as we do for Vars. But I'll
let that idea go until there's some evidence it's worthwhile.)
Back-patch to 9.6. The problem also exists in 9.5 where GROUPING SETS
came in, but this patch is insufficient to resolve the problem in 9.5:
there is some obscure dependency on the upper-planner-pathification
work that happened in 9.6. Given that this is such a weird corner case,
and no end users have complained about it, it doesn't seem worth the work
to develop a fix for 9.5.
Patch by me, per a report from Heikki Linnakangas. (This does not fix
Heikki's original complaint, just the follow-on one.)
Discussion: https://postgr.es/m/aefc657e-edb2-64d5-6df1-a0828f6e9104@iki.fi
|
|
This lets it do the right thing for, eg, varchar columns.
Back-patch to 9.5 where this logic appeared.
David Rowley, per report from Kim Rose Carlsen
Discussion: https://postgr.es/m/VI1PR05MB17091F9A9876528055D6A827C76D0@VI1PR05MB1709.eurprd05.prod.outlook.com
|
|
If the user modifies a view that has CHECK OPTIONs and this gets
translated into a modification to an underlying relation which happens
to be a foreign table, the check options should be enforced. In the
normal code path, that was happening properly, but it was not working
properly for "direct" modification because the whole operation gets
pushed to the remote side in that case and we never have an option to
enforce the constraint against individual tuples. Fix by disabling
direct modification when there is a need to enforce CHECK OPTIONs.
Etsuro Fujita, reviewed by Kyotaro Horiguchi and by me.
Discussion: http://postgr.es/m/f8a48f54-6f02-9c8a-5250-9791603171ee@lab.ntt.co.jp
|
|
David Rowley found that the "use the smallest per-column selectivity"
heuristic applied in some cases by get_foreign_key_join_selectivity()
was badly off if the FK columns are independent, producing estimates
much worse than we got before that code was added in 9.6.
One case where that heuristic was used was for LEFT and FULL outer joins
with the referenced rel on the outside of the join. But we should not
really need to special-case those here. eqjoinsel() never has had such a
special case; the correction is applied by calc_joinrel_size_estimate()
instead. Let's just estimate such cases like inner joins and rely on that
later adjustment. (I think there was something of a thinko here, in that
the comments seem to be thinking about the selectivity as defined for
semi/anti joins; but that shouldn't apply to left/full joins.) Add a
regression test exercising such a case to show that this is sane in
at least some cases.
The other case where we used that heuristic was for SEMI/ANTI outer joins,
either if the referenced rel was on the outside, or if it was on the inside
but was part of a join within the RHS. In either case, the FK doesn't give
us a lot of traction towards estimating the selectivity. To ensure that
we don't have regressions from what happened before 9.6, let's punt by
ignoring the FK in such cases and applying the traditional selectivity
calculation. (We might be able to improve on that later, but for now
I just want to be sure it's not worse than 9.5.)
Report and patch by David Rowley, simplified a bit by me. Back-patch
to 9.6 where this code was added.
Discussion: https://postgr.es/m/CAKJS1f8NO8oCDcxrteohG6O72uU1saEVT9qX=R8pENr5QWerXw@mail.gmail.com
|