| Age | Commit message (Collapse) | Author |
|
When pulling up a subquery, we may need to wrap its targetlist items
in PlaceHolderVars to enforce separate identity or as a result of
outer joins. However, this causes any upper-level WHERE clauses
referencing these outputs to contain PlaceHolderVars, which prevents
indxpath.c from recognizing that they could be matched to index
columns or index expressions, potentially affecting the planner's
ability to use indexes.
To fix, explicitly strip PlaceHolderVars from index operands. A
PlaceHolderVar appearing in a relation-scan-level expression is
effectively a no-op. Nevertheless, to play it safe, we strip only
PlaceHolderVars that are not marked nullable.
The stripping is performed recursively to handle cases where
PlaceHolderVars are nested or interleaved with other node types. To
minimize performance impact, we first use a lightweight walker to
check for the presence of strippable PlaceHolderVars. The expensive
mutator is invoked only if a candidate is found, avoiding unnecessary
memory allocation and tree copying in the common case where no
PlaceHolderVars are present.
Back-patch to v18. Although this issue exists before that, changes in
this version made it common enough to notice. Given the lack of field
reports for older versions, I am not back-patching further.
Reported-by: Haowu Ge <gehaowu@bitmoe.com>
Author: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/62af586c-c270-44f3-9c5e-02c81d537e3d.gehaowu@bitmoe.com
Backpatch-through: 18
|
|
If there are any SRFs in a PathTarget, we must separate it into
SRF-computing and SRF-free targets. This is because the executor can
only handle SRFs that appear at the top level of the targetlist of a
ProjectSet plan node.
If we find a subexpression that matches an expression already computed
in the previous plan level, we should treat it like a Var and should
not split it again. setrefs.c will later replace the expression with
a Var referencing the subplan output.
However, when processing the grouping target for grouping sets, the
planner can fail to recognize that an expression is already computed
in the scan/join phase. The root cause is a mismatch in the
nullingrels bits. Expressions in the grouping target carry the
grouping nulling bit in their nullingrels to indicate that they can be
nulled by the grouping step. However, the corresponding expressions
in the scan/join target do not have these bits.
As a result, the exact match check in list_member() fails, leading the
planner to incorrectly believe that the expression needs to be
re-evaluated from its arguments, which are often not available in the
subplan. This can lead to planner errors such as "variable not found
in subplan target list".
To fix, ignore the grouping nulling bit when checking whether an
expression from the grouping target is available in the pre-grouping
input target. This aligns with the matching logic in setrefs.c.
Backpatch to v18, where this issue was introduced.
Bug: #19353
Reported-by: Marian MULLER REBEYROL <marian.muller@serli.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/19353-aaa179bba986a19b@postgresql.org
Backpatch-through: 18
|
|
The idea is to encourage more the use of these new routines across the
tree, as these offer stronger type safety guarantees than palloc().
This batch of changes includes most of the trivial changes suggested by
the author for src/backend/.
A total of 334 files are updated here. Among these files, 48 of them
have their build change slightly; these are caused by line number
changes as the new allocation formulas are simpler, shaving around 100
lines of code in total.
Similar work has been done in 0c3c5c3b06a3 and 31d3847a37be.
Author: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com
|
|
query_is_distinct_for() is intended to determine whether a query never
returns duplicates of the specified columns. For queries using
grouping sets, if there are no grouping expressions, the query may
contain one or more empty grouping sets. The goal is to detect
whether there is exactly one empty grouping set, in which case the
query would return a single row and thus be distinct.
The previous logic in query_is_distinct_for() was incomplete because
the check was insufficiently thorough and could return false when it
could have returned true. It failed to consider cases where the
DISTINCT clause is used on the GROUP BY, in which case duplicate empty
grouping sets are removed, leaving only one. It also did not
correctly handle all possible structures of GroupingSet nodes that
represent a single empty grouping set.
To fix, add a check for the groupDistinct flag, and expand the query's
groupingSets tree into a flat list, then verify that the expanded list
contains only one element.
No backpatch as this could result in plan changes.
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAMbWs480Z04NtP8-O55uROq2Zego309+h3hhaZhz6ztmgWLEBw@mail.gmail.com
|
|
apply_scanjoin_target_to_paths wants to avoid useless work and
platform-specific dependencies by throwing away the path list created
prior to applying the final scan/join target and constructing a whole
new one using the final scan/join target. However, this is only valid
when we'll consider all the same strategies after the pathlist reset
as before.
After resetting the path list, we reconsider Append and MergeAppend
paths with the modified target list; therefore, it's only valid for a
partitioned relation. However, what the previous coding missed is that
it cannot be a partitioned join relation, because that also has paths
that are not Append or MergeAppend paths and will not be reconsidered.
Thus, before this patch, we'd sometimes choose a partitionwise strategy
with a higher total cost than cheapest non-partitionwise strategy,
which is not good.
We had a surprising number of tests cases that were relying on this
bug to work as they did. A big part of the reason for this is that row
counts in regression test cases tend to be low, which brings the cost
of partitionwise and non-partitionwise strategies very close together,
especially for merge joins, where the real and perceived advantages of
a partitionwise approach are minimal. In addition, one test case
included a row-count-inflating join. In such cases, a partitionwise
join can easily be a loser on cost, because the total number of tuples
passing through an Append node is much higher than it is with a
non-partitionwise strategy. That test case is adjusted by adding
additional join clauses to avoid the row count inflation.
Although the failure of the planner to choose the lowest-cost path is a
bug, we generally do not back-patch fixes of this type, because planning
is not an exact science and there is always a possibility that some user
will end up with a plan that has a lower estimated cost but actually
runs more slowly. Hence, no backpatch here, either.
The code change here is exactly what was originally proposed by
Ashutosh, but the changes to the comments and test cases have been
very heavily rewritten by me, helped along by some very useful advice
from Richard Guo.
Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Author: Robert Haas <rhaas@postgresql.org>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Arne Roland <arne.roland@malkut.net>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: http://postgr.es/m/CAExHW5toze58+jL-454J3ty11sqJyU13Sz5rJPQZDmASwZgWiA@mail.gmail.com
|
|
This removes some casts where the input already has the same type as
the type specified by the cast. Their presence could cause risks of
hiding actual type mismatches in the future or silently discarding
qualifiers. It also improves readability. Same kind of idea as
7f798aca1d5 and ef8fe693606. (This does not change all such
instances, but only those hand-picked by the author.)
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/aSQy2JawavlVlEB0%40ip-10-97-1-34.eu-west-3.compute.internal
|
|
This adds SupportRequestSimplifyAggref to allow pg_proc.prosupport
functions to receive an Aggref and allow them to determine if there is a
way that the Aggref call can be optimized.
Also added is a support function to allow transformation of COUNT(ANY)
into COUNT(*). This is possible to do when the given "ANY" cannot be
NULL and also that there are no ORDER BY / DISTINCT clauses within the
Aggref. This is a useful transformation to do as it is common that
people write COUNT(1), which until now has added unneeded overhead.
When counting a NOT NULL column. The overheads can be worse as that
might mean deforming more of the tuple, which for large fact tables may
be many columns in.
It may be possible to add prosupport functions for other aggregates. We
could consider if ORDER BY could be dropped for some calls, e.g. the
ORDER BY is quite useless in MAX(c ORDER BY c).
There is a little bit of passing fallout from adjusting
expr_is_nonnullable() to handle Const which results in a plan change in
the aggregates.out regression test. Previously, nothing was able to
determine that "One-Time Filter: (100 IS NOT NULL)" was always true,
therefore useless to include in the plan.
Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqGcPTagXpKfH=CrmHBqALpziThJEDs_MrPqjKVeDF9wA@mail.gmail.com
|
|
The comment for ChangeVarNodes() refers to a parameter named
change_RangeTblRef, which does not exist in the code.
The comment for ChangeVarNodesExtended() contains an extra space,
while the comment for replace_relid_callback() has an awkward line
break and a typo.
This patch fixes these issues and revises some sentences for smoother
wording.
Oversights in commits ab42d643c and fc069a3a6.
Author: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs480j16HC1JtjKCgj5WshivT8ZJYkOfTyZAM0POjFomJkg@mail.gmail.com
Backpatch-through: 18
|
|
We've been nibbling away at removing uses of "long" for a long time,
since its width is platform-dependent. Here's one more: change the
remaining "long" fields in Plan nodes to Cardinality, since the three
surviving examples all represent group-count estimates. The upstream
planner code was converted to Cardinality some time ago; for example
the corresponding fields in Path nodes are type Cardinality, as are
the arguments of the make_foo_path functions. Downstream in the
executor, it turns out that these all feed to the table-size argument
of BuildTupleHashTable. Change that to "double" as well, and fix it
so that it safely clamps out-of-range values to the uint32 limit of
simplehash.h, as was not being done before.
Essentially, this is removing all the artificial datatype-dependent
limitations on these values from upstream processing, and applying
just one clamp at the moment where we're forced to do so by the
datatype choices of simplehash.h.
Also, remove BuildTupleHashTable's misguided attempt to enforce
work_mem/hash_mem_limit. It doesn't have enough information
(particularly not the expected tuple width) to do that accurately,
and it has no real business second-guessing the caller's choice.
For all these plan types, it's really the planner's responsibility
to not choose a hashed implementation if the hashtable is expected
to exceed hash_mem_limit. The previous patch improved the
accuracy of those estimates, and even if BuildTupleHashTable had
more information it should arrive at the same conclusions.
Reported-by: Jeff Janes <jeff.janes@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAMkU=1zia0JfW_QR8L5xA2vpa0oqVuiapm78h=WpNsHH13_9uw@mail.gmail.com
|
|
For several types of plan nodes that use TupleHashTables, the
planner estimated the expected size of the table as basically
numEntries * (MAXALIGN(dataWidth) + MAXALIGN(SizeofHeapTupleHeader)).
This is pretty far off, especially for small data widths, because
it doesn't account for the overhead of the simplehash.h hash table
nor for any per-tuple "additional space" the plan node may request.
Jeff Janes noted a case where the estimate was off by about a factor
of three, even though the obvious hazards such as inaccurate estimates
of numEntries or dataWidth didn't apply.
To improve matters, create functions provided by the relevant executor
modules that can estimate the required sizes with reasonable accuracy.
(We're still not accounting for effects like allocator padding, but
this at least gets the first-order effects correct.)
I added functions that can estimate the tuple table sizes for
nodeSetOp and nodeSubplan; these rely on an estimator for
TupleHashTables in general, and that in turn relies on one for
simplehash.h hash tables. That feels like kind of a lot of mechanism,
but if we take any short-cuts we're violating modularity boundaries.
The other places that use TupleHashTables are nodeAgg, which took
pains to get its numbers right already, and nodeRecursiveunion.
I did not try to improve the situation for nodeRecursiveunion because
there's nothing to improve: we are not making an estimate of the hash
table size, and it wouldn't help us to do so because we have no
non-hashed alternative implementation. On top of that, our estimate
of the number of entries to be hashed in that module is so suspect
that we'd likely often choose the wrong implementation if we did have
two ways to do it.
Reported-by: Jeff Janes <jeff.janes@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAMkU=1zia0JfW_QR8L5xA2vpa0oqVuiapm78h=WpNsHH13_9uw@mail.gmail.com
|
|
67a54b9e8 taught the planner to push down HAVING clauses even when
grouping sets are present, as long as the clause does not reference
any columns that are nullable by the grouping sets. However, there
was an oversight: if any empty grouping sets are present, the
aggregation node can produce a row that did not come from the input,
and pushing down a HAVING clause in this case may cause us to fail to
filter out that row.
Currently, non-degenerate HAVING clauses are not pushed down when
empty grouping sets are present, since the empty grouping sets would
nullify the vars they reference. However, degenerate (variable-free)
HAVING clauses are not subject to this restriction and may be
incorrectly pushed down.
To fix, explicitly check for the presence of empty grouping sets and
retain degenerate clauses in HAVING when they are present. This
ensures that we don't emit a bogus aggregated row. A copy of each
such clause is also put in WHERE so that query_planner() can use it in
a gating Result node.
To facilitate this check, this patch expands the groupingSets tree of
the query to a flat list of grouping sets before applying the HAVING
pushdown optimization. This does not add any additional planning
overhead, since we need to do this expansion anyway.
In passing, make a small tweak to preprocess_grouping_sets() by
reordering its initial operations a bit.
Backpatch to v18, where this issue was introduced.
Reported-by: Yuhang Qiu <iamqyh@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/0879D9C9-7FE2-4A20-9593-B23F7A0B5290@gmail.com
Backpatch-through: 18
|
|
This information appears to have been unused since commit
c5b7ba4e67. We could not find any references in third-party code,
either.
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aO_CyFRpbVMtgJWM%40nathan
|
|
In initsplan.c, no macros for built-in function OIDs are used, so this
include is unnecessary and can be removed. This was my oversight in
commit 8e1185910.
Discussion: https://postgr.es/m/CAMbWs4_-sag-cAKrLJ+X+5njL1=oudk=+KfLmsLZ5a2jckn=kg@mail.gmail.com
|
|
These hooks allow plugins to get control at the earliest point at
which the PlannerGlobal object is fully initialized, and then just
before it gets destroyed. This is useful in combination with the
extendable plan state facilities (see extendplan.h) and perhaps for
other purposes as well.
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com
|
|
This allows extensions to have access to any data they've stored
in the ExplainState during planning. Unfortunately, it won't help
with EXPLAIN EXECUTE is used, but since that case is less common,
this still seems like an improvement.
Since planner() has quite a few arguments now, also add some
documentation of those arguments and the return value.
Author: Robert Haas <rhaas@postgresql.org>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com
|
|
Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined. Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.
In the current planner architecture, the separation between the
scan/join planning phase and the post-scan/join phase means that
aggregation steps are not visible when constructing the join tree,
limiting the planner's ability to exploit aggregation-aware
optimizations. To implement eager aggregation, we collect information
about aggregate functions in the targetlist and HAVING clause, along
with grouping expressions from the GROUP BY clause, and store it in
the PlannerInfo node. During the scan/join planning phase, this
information is used to evaluate each base or join relation to
determine whether eager aggregation can be applied. If applicable, we
create a separate RelOptInfo, referred to as a grouped relation, to
represent the partially-aggregated version of the relation and
generate grouped paths for it.
Grouped relation paths can be generated in two ways. The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths. To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths in this step.
Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation. Joining two grouped relations
is currently not supported.
To further limit planning time, we currently adopt a strategy where
partial aggregation is pushed only to the lowest feasible level in the
join tree where it provides a significant reduction in row count.
This strategy also helps ensure that all grouped paths for the same
grouped relation produce the same set of rows, which is important to
support a fundamental assumption of the planner.
For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators. This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does. This
ensures that all rows within the same partial group share the same
"destiny", which is crucial for maintaining correctness.
One restriction is that we cannot push partial aggregation down to a
relation that is in the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation. Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.
If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end. The final paths will compete in the
usual way with paths built from regular planning.
The patch was originally proposed by Antonin Houska in 2017. This
commit reworks various important aspects and rewrites most of the
current code. However, the original patch and reviews were very
useful.
Author: Richard Guo <guofenglinux@gmail.com>
Author: Antonin Houska <ah@cybertec.at> (in an older version)
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me> (in an older version)
Reviewed-by: Andy Fan <zhihuifan1213@163.com> (in an older version)
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> (in an older version)
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
|
|
Instead, use the new mechanism that allows planner extensions to store
private state inside a PlannerInfo, treating GEQO as an in-core planner
extension. This is a useful test of the new facility, and also buys
back a few bytes of storage.
To make this work, we must remove innerrel_is_unique_ext's hack of
testing whether join_search_private is set as a proxy for whether
the join search might be retried. Add a flag that extensions can
use to explicitly signal their intentions instead.
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com
|
|
Previously, subqueries were given names only after they were planned,
which makes it difficult to use information from a previous execution of
the query to guide future planning. If, for example, you knew something
about how you want "InitPlan 2" to be planned, you won't know whether
the subquery you're currently planning will end up being "InitPlan 2"
until after you've finished planning it, by which point it's too late to
use the information that you had.
To fix this, assign each subplan a unique name before we begin planning
it. To improve consistency, use textual names for all subplans, rather
than, as we did previously, a mix of numbers (such as "InitPlan 1") and
names (such as "CTE foo"), and make sure that the same name is never
assigned more than once.
We adopt the somewhat arbitrary convention of using the type of sublink
to set the plan name; for example, a query that previously had two
expression sublinks shown as InitPlan 2 and InitPlan 1 will now end up
named expr_1 and expr_2. Because names are assigned before rather than
after planning, some of the regression test outputs show the numerical
part of the name switching positions: what was previously SubPlan 2 was
actually the first one encountered, but we finished planning it later.
We assign names even to subqueries that aren't shown as such within the
EXPLAIN output. These include subqueries that are a FROM clause item or
a branch of a set operation, rather than something that will be turned
into an InitPlan or SubPlan. The purpose of this is to make sure that,
below the topmost query level, there's always a name for each subquery
that is stable from one planning cycle to the next (assuming no changes
to the query or the database schema).
Author: Robert Haas <rhaas@postgresql.org>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: http://postgr.es/m/3641043.1758751399@sss.pgh.pa.us
|
|
The prior code, added in 03d40e4b5 attempted to use the targetlist of the
first UNION child when all UNION children were proven as dummy rels.
That's not going to work when some operation atop of the Result node must
find target entries within the Result's targetlist. This could have been
something as simple as trying to sort the results of the UNION operation,
which would lead to:
ERROR: could not find pathkey item to sort
Instead, use the top-level UNION's targetlist and fix the varnos in
setrefs.c. Because set operation targetlists always use varno==0, we
can rewrite those to become varno==1, i.e. use the Vars from the first
UNION child. This does result in showing Vars from relations that are
not present in the final plan, but that's no different to what we see
when normal base relations are proven dummy.
Without this fix it would be possible to see the following error in
EXPLAIN VERBOSE when all UNION inputs were proven empty.
ERROR: bogus varno: 0
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvrUASy9sfULMEsM2udvZJP6AoBRCZvHYXYxZTy2tX9FYw@mail.gmail.com
|
|
Result nodes now include an RTI set, which is only non-NULL when they
have no subplan, and is taken from the relid set of the RelOptInfo that
the Result is generating. ExplainPreScanNode now takes notice of these
RTIs, which means that a few things get schema-qualified in the
regression tests that previously did not. This makes the output more
consistent between cases where some part of the plan tree is replaced by
a Result node and those where this does not happen.
Likewise, pg_overexplain's EXPLAIN (RANGE_TABLE) now displays the RTIs
stored in a Result node just as it already does for other RTI-bearing
node types.
Result nodes also now include a result_reason, which tells us something
about why the Result node was inserted. Using that information, EXPLAIN
now emits, where relevant, a "Replaces" line describing the origin of
a Result node.
The purpose of these changes is to allow code that inspects a Plan
tree to understand the origin of Result nodes that appear therein.
Discussion: http://postgr.es/m/CA+TgmoYeUZePZWLsSO+1FAN7UPePT_RMEZBKkqYBJVCF1s60=w@mail.gmail.com
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
|
|
This is just like the previous commit, but for a different invented
alias name.
Author: Robert Haas <rhaas@postgresql.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+TgmoYSYmDA2GvanzPMci084n+mVucv0bJ0HPbs6uhmMN6HMg@mail.gmail.com
|
|
For a child relation, we should not assume that its parent's
unique-ified relation (or unique-ified path in v18) always exists. In
cases where all RHS columns that need to be unique-ified are equated
to constants, the unique-ified relation/path for the parent table is
not built, as there are no columns left to unique-ify. Failing to
account for this can result in a SIGSEGV crash during planning.
This patch checks whether the parent's unique-ified relation or path
exists and skips unique-ification of the child relation if it does
not.
Author: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49MOdLW2c+qbLHHBt8VBu=4ONpM91D19=AWeW93eFUF6A@mail.gmail.com
Backpatch-through: 18
|
|
One mechanism we have for implementing semi-joins is to de-duplicate
the output of the RHS and then treat the join as a plain inner join.
Initial construction of the join's SpecialJoinInfo identifies the
RHS columns that need to be de-duplicated, but later we may find that
some of those don't need to be handled explicitly, either because
they're known to be constant or because they are redundant with some
previous column.
Up to now, while sort-based de-duplication handled such cases well,
hash-based de-duplication didn't: we'd still hash on all of the
originally-identified columns. This is probably not a very big
deal performance-wise, but in the wake of commit a3179ab69 it can
cause planner errors. That happens when join elimination causes
recalculation of variables' attr_needed bitmapsets, and we decide
that a variable mentioned in a semijoin clause doesn't need to be
propagated up to the join level anymore.
There are a number of ways we could slice the blame for this, but the
only fix that doesn't result in pessimizing plans for loosely-related
cases is to be more careful about not hashing columns we don't
actually need to de-duplicate. We can install that consideration
into create_unique_paths in master, or the predecessor code in
create_unique_path in v18, without much refactoring.
(As follow-up work, it might be a good idea to look at more-invasive
refactoring, in hopes of preventing other bugs in this area. But
with v18 release so close, there's not time for that now, nor would
we be likely to want to put such refactoring into v18 anyway.)
Reported-by: Sergey Soloviev <sergey.soloviev@tantorlabs.ru>
Diagnosed-by: Richard Guo <guofenglinux@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/1fd1a421-4609-4d46-a1af-ab74d5de504a@tantorlabs.ru
Backpatch-through: 18
|
|
The Self-Join Elimination SJE feature messes up keeping and removing RowMark's
in remove_self_joins_one_group(). That didn't lead to user-level error,
because the planned RowMark is only used to reference a rtable entry in later
execution stages. An RTE entry for keeping and removing relations is
identical and refers to the same relation OID.
To reduce confusion and prevent future issues, this commit cleans up the code
and fixes the incorrect behaviour. Furthermore, it includes sanity checks in
setrefs.c on existing non-null RTE and RelOptInfo entries for each RowMark.
Discussion: https://postgr.es/m/18c6bd6c-6d2a-419a-b0da-dfedef34b585%40gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Backpatch-through: 18
|
|
Rename inner and outer to rrel and krel, respectively, to highlight their
connection to r and k indexes. For the same reason, rename imark and omark
to rmark and kmark.
Discussion: https://postgr.es/m/18c6bd6c-6d2a-419a-b0da-dfedef34b585%40gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Backpatch-through: 18
|
|
Now that the only call to relation_has_unique_index_for() that
supplied an exprlist and oprlist has been removed, the loop handling
those lists is effectively dead code. This patch removes that loop
and simplifies the function accordingly.
Author: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-EBnaRvEs7frTLbsXiweSTUXifsteF-d3rvv01FKO86w@mail.gmail.com
|
|
There are two implementation techniques for semijoins: one uses the
JOIN_SEMI jointype, where the executor emits at most one matching row
per left-hand side (LHS) row; the other unique-ifies the right-hand
side (RHS) and then performs a plain inner join.
The latter technique currently has some drawbacks related to the
unique-ification step.
* Only the cheapest-total path of the RHS is considered during
unique-ification. This may cause us to miss some optimization
opportunities; for example, a path with a better sort order might be
overlooked simply because it is not the cheapest in total cost. Such
a path could help avoid a sort at a higher level, potentially
resulting in a cheaper overall plan.
* We currently rely on heuristics to choose between hash-based and
sort-based unique-ification. A better approach would be to generate
paths for both methods and allow add_path() to decide which one is
preferable, consistent with how path selection is handled elsewhere in
the planner.
* In the sort-based implementation, we currently pay no attention to
the pathkeys of the input subpath or the resulting output. This can
result in redundant sort nodes being added to the final plan.
This patch improves semijoin planning by creating a new RelOptInfo for
the RHS rel to represent its unique-ified version. It then generates
multiple paths that represent elimination of distinct rows from the
RHS, considering both a hash-based implementation using the cheapest
total path of the original RHS rel, and sort-based implementations
that either exploit presorted input paths or explicitly sort the
cheapest total path. All resulting paths compete in add_path(), and
those deemed worthy of consideration are added to the new RelOptInfo.
Finally, the unique-ified rel is joined with the other side of the
semijoin using a plain inner join.
As a side effect, most of the code related to the JOIN_UNIQUE_OUTER
and JOIN_UNIQUE_INNER jointypes -- used to indicate that the LHS or
RHS path should be made unique -- has been removed. Besides, the
T_Unique path now has the same meaning for both semijoins and upper
DISTINCT clauses: it represents adjacent-duplicate removal on
presorted input. This patch unifies their handling by sharing the
same data structures and functions.
This patch also removes the UNIQUE_PATH_NOOP related code along the
way, as it is dead code -- if the RHS rel is provably unique, the
semijoin should have already been simplified to a plain inner join by
analyzejoins.c.
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-EBnaRvEs7frTLbsXiweSTUXifsteF-d3rvv01FKO86w@mail.gmail.com
|
|
Remove conditionally-compiled code for the other case.
Replace uses of FLOAT8PASSBYVAL with constant "true", mainly because
it was quite confusing in cases where the type we were dealing with
wasn't float8.
I left the associated pg_control and Pg_magic_struct fields in place.
Perhaps we should get rid of them, but it would save little, so it
doesn't seem worth thinking hard about the compatibility implications.
I just labeled them "vestigial" in places where that seemed helpful.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/1749799.1752797397@sss.pgh.pa.us
|
|
Commit e2d4ef8de86 (the fix for CVE-2017-7484) added security checks
to the selectivity estimation functions to prevent them from running
user-supplied operators on data obtained from pg_statistic if the user
lacks privileges to select from the underlying table. In cases
involving inheritance/partitioning, those checks were originally
performed against the child RTE (which for plain inheritance might
actually refer to the parent table). Commit 553d2ec2710 then extended
that to also check the parent RTE, allowing access if the user had
permissions on either the parent or the child. It turns out, however,
that doing any checks using the child RTE is incorrect, since
securityQuals is set to NULL when creating an RTE for an inheritance
child (whether it refers to the parent table or the child table), and
therefore such checks do not correctly account for any RLS policies or
security barrier views. Therefore, do the security checks using only
the parent RTE. This is consistent with how RLS policies are applied,
and the executor's ACL checks, both of which use only the parent
table's permissions/policies. Similar checks are performed in the
extended stats code, so update that in the same way, centralizing all
the checks in a new function.
In addition, note that these checks by themselves are insufficient to
ensure that the user has access to the table's data because, in a
query that goes via a view, they only check that the view owner has
permissions on the underlying table, not that the current user has
permissions on the view itself. In the selectivity estimation
functions, there is no easy way to navigate from underlying tables to
views, so add permissions checks for all views mentioned in the query
to the planner startup code. If the user lacks permissions on a view,
a permissions error will now be reported at planner-startup, and the
selectivity estimation functions will not be run.
Checking view permissions at planner-startup in this way is a little
ugly, since the same checks will be repeated at executor-startup.
Longer-term, it might be better to move all the permissions checks
from the executor to the planner so that permissions errors can be
reported sooner, instead of creating a plan that won't ever be run.
However, such a change seems too far-reaching to be back-patched.
Back-patch to all supported versions. In v13, there is the added
complication that UPDATEs and DELETEs on inherited target tables are
planned using inheritance_planner(), which plans each inheritance
child table separately, so that the selectivity estimation functions
do not know that they are dealing with a child table accessed via its
parent. Handle that by checking access permissions on the top parent
table at planner-startup, in the same way as we do for views. Any
securityQuals on the top parent table are moved down to the child
tables by inheritance_planner(), so they continue to be checked by the
selectivity estimation functions.
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Backpatch-through: 13
Security: CVE-2025-8713
|
|
Commit 9e6104c66 disallowed transition tables on foreign tables, but
failed to account for cases where a foreign table is a child table of a
partitioned/inherited table on which transition tables exist, leading to
incorrect transition tuples collected from such foreign tables for
queries on the parent table triggering transition capture. This
occurred not only for inherited UPDATE/DELETE but for partitioned INSERT
later supported by commit 3d956d956, which should have handled it at
least for the INSERT case, but didn't.
To fix, modify ExecAR*Triggers to throw an error if the given relation
is a foreign table requesting transition capture. Also, this commit
fixes make_modifytable so that in case of an inherited UPDATE/DELETE
triggering transition capture, FDWs choose normal operations to modify
child foreign tables, not DirectModify; which is needed because they
would otherwise skip the calls to ExecAR*Triggers at execution, causing
unexpected behavior.
Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CAPmGK14QJYikKzBDCe3jMbpGENnQ7popFmbEgm-XTNuk55oyHg%40mail.gmail.com
Backpatch-through: 13
|
|
Commit 719dcf3c42 introduced a field called CachedPlanType in
PlannedStmt to allow extensions to determine whether a cached plan is
generic or custom.
After discussion, the concepts that we want to track are a bit wider
than initially anticipated, as it is closer to knowing from which
"source" or "origin" a PlannedStmt has been generated or retrieved.
Custom and generic cached plans are a subset of that.
Based on the state of HEAD, we have been able to define two more
origins:
- "standard", for the case where PlannedStmt is generated in
standard_planner(), the most common case.
- "internal", for the fake PlannedStmt generated internally by some
query patterns.
This could be tuned in the future depending on what is needed. This
looks like a good starting point, at least. The default value is called
"UNKNOWN", provided as fallback value. This value is not used in the
core code, the idea is to let extensions building their own PlannedStmts
know about this new field.
Author: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/aILaHupXbIGgF2wJ@paquier.xyz
|
|
There've been a few complaints that it can be overly difficult to figure
out why the planner picked a Memoize plan. To help address that, here we
adjust the EXPLAIN output to display the following additional details:
1) The estimated number of cache entries that can be stored at once
2) The estimated number of unique lookup keys that we expect to see
3) The number of lookups we expect
4) The estimated hit ratio
Technically #4 can be calculated using #1, #2 and #3, but it's not a
particularly obvious calculation, so we opt to display it explicitly.
The original patch by Lukas Fittl only displayed the hit ratio, but
there was a fear that might lead to more questions about how that was
calculated. The idea with displaying all 4 is to be transparent which
may allow queries to be tuned more easily. For example, if #2 isn't
correct then maybe extended statistics or a manual n_distinct estimate can
be used to help fix poor plan choices.
Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Author: Lukas Fittl <lukas@fittl.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CAP53Pky29GWAVVk3oBgKBDqhND0BRBN6yTPeguV_qSivFL5N_g%40mail.gmail.com
|
|
PlannedStmt gains a new field, called CachedPlanType, able to track if a
given plan tree originates from the cache and if we are dealing with a
generic or custom cached plan.
This field can be used for monitoring or statistical purposes, in the
executor hooks, for example, based on the planned statement attached to
a QueryDesc. A patch is under discussion for pg_stat_statements to
provide an equivalent of the counters in pg_prepared_statements for
custom and generic plans, to provide a more global view of such data, as
this data is now restricted to the current session.
The concept introduced in this commit is useful on its own, and has been
extracted from a larger patch by the same author.
Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAA5RZ0uFw8Y9GCFvafhC=OA8NnMqVZyzXPfv_EePOt+iv1T-qQ@mail.gmail.com
|
|
In commit b262ad440, we introduced an optimization that reduces an IS
[NOT] NULL qual on a NOT NULL column to constant true or constant
false, provided we can prove that the input expression of the NullTest
is not nullable by any outer joins or grouping sets. This deduction
happens quite late in the planner, during the distribution of quals to
rels in query_planner. However, this approach has some drawbacks: we
can't perform any further folding with the constant, and it turns out
to be prone to bugs.
Ideally, this deduction should happen during constant folding.
However, the per-relation information about which columns are defined
as NOT NULL is not available at that point. This information is
currently collected from catalogs when building RelOptInfos for base
or "other" relations.
This patch moves the collection of NOT NULL attribute information for
relations before pull_up_sublinks, storing it in a hash table keyed by
relation OID. It then uses this information to perform the NullTest
deduction for Vars during constant folding. This also makes it
possible to leverage this information to pull up NOT IN subqueries.
Note that this patch does not get rid of restriction_is_always_true
and restriction_is_always_false. Removing them would prevent us from
reducing some IS [NOT] NULL quals that we were previously able to
reduce, because (a) the self-join elimination may introduce new IS NOT
NULL quals after constant folding, and (b) if some outer joins are
converted to inner joins, previously irreducible NullTest quals may
become reducible.
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAMbWs4-bFJ1At4btk5wqbezdu8PLtQ3zv-aiaY3ry9Ymm=jgFQ@mail.gmail.com
|
|
There are several pieces of catalog information that need to be
retrieved for a relation during the early stage of planning. These
include relhassubclass, which is used to clear the inh flag if the
relation has no children, as well as a column's attgenerated and
default value, which are needed to expand virtual generated columns.
More such information may be required in the future.
Currently, these pieces of catalog data are collected in multiple
places, resulting in repeated table_open/table_close calls for each
relation in the rangetable. This patch centralizes the collection of
all required early-stage catalog information into a single loop over
the rangetable, allowing each relation to be opened and closed only
once.
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAMbWs4-bFJ1At4btk5wqbezdu8PLtQ3zv-aiaY3ry9Ymm=jgFQ@mail.gmail.com
|
|
Currently, we expand virtual generated columns after we have pulled up
any SubLinks within the query's quals. This ensures that the virtual
generated column references within SubLinks that should be transformed
into joins are correctly expanded. This approach works well and has
posed no issues.
In an upcoming patch, we plan to centralize the collection of catalog
information needed early in the planner. This will help avoid
repeated table_open/table_close calls for relations in the rangetable.
Since this information is required during sublink pull-up, we are
moving the expansion of virtual generated columns to occur beforehand.
To achieve this, if any EXISTS SubLinks can be pulled up, their
rangetables are processed just before pulling them up.
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAMbWs4-bFJ1At4btk5wqbezdu8PLtQ3zv-aiaY3ry9Ymm=jgFQ@mail.gmail.com
|
|
For an ordered Append or MergeAppend, we need to inject an explicit
sort into any subpath that is not already well enough ordered.
Currently, only explicit full sorts are considered; incremental sorts
are not yet taken into account.
In this patch, for subpaths of an ordered Append or MergeAppend, we
choose to use explicit incremental sort if it is enabled and there are
presorted keys.
The rationale is based on the assumption that incremental sort is
always faster than full sort when there are presorted keys, a premise
that has been applied in various parts of the code. In addition, the
current cost model tends to favor incremental sort as being cheaper
than full sort in the presence of presorted keys, making it reasonable
not to consider full sort in such cases.
No backpatch as this could result in plan changes.
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4_V7a2enTR+T3pOY_YZ-FU8ZsFYym2swOz4jNMqmSgyuw@mail.gmail.com
|
|
In the wake of commit a16ef313f, we need to deal with more cases
involving PlaceHolderVars in NestLoopParams than we did before.
For one thing, a16ef313f was incorrect to suppose that we could
rely on the required-outer relids of the lefthand path to decide
placement of nestloop-parameter PHVs. As Richard Guo argued at
the time, we must look at the required-outer relids of the join
path itself.
For another, we have to apply replace_nestloop_params() to such
a PHV's expression, in case it contains references to values that
will be supplied from NestLoopParams of higher-level nestloops.
For another, we need to be more careful about the phnullingrels
of the PHV than we were being. identify_current_nestloop_params
only bothered to ensure that the phnullingrels didn't contain
"too many" relids, but now it has to be exact, because setrefs.c
will apply both NRM_SUBSET and NRM_SUPERSET checks in different
places. We can compute the correct relids by determining the
set of outer joins that should be able to null the PHV and then
subtracting whatever's been applied at or below this join.
Do the same for plain Vars, too. (This should make it possible
to use NRM_EQUAL to process nestloop params in setrefs.c, but
I won't risk making such a change in v18 now.)
Lastly, if a nestloop parameter PHV was pulled up out of a subquery
and it contains a subquery that was originally pushed down from this
query level, then that will still be represented as a SubLink, because
SS_process_sublinks won't recurse into outer PHVs, so it didn't get
transformed during expression preprocessing in the subquery. We can
substitute the version of the PHV's expression appearing in its
PlaceHolderInfo to ensure that that preprocessing has happened.
(Seems like this processing sequence could stand to be redesigned,
but again, late in v18 development is not the time for that.)
It's not very clear to me why the old have_dangerous_phv join-order
restriction prevented us from seeing the last three of these problems.
But given the lack of field complaints, it must have done so.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18953-1c9883a9d4afeb30@postgresql.org
|
|
Commit 85e5e222b, which added (a forerunner of) this logic,
argued that
Adding the necessary complexity to make this work doesn't seem like
it would be repaid in significantly better plans, because in cases
where such a PHV exists, there is probably a corresponding join order
constraint that would allow a good plan to be found without using the
star-schema exception.
The flaw in this claim is that there may be other join-order
restrictions that prevent us from finding a join order that doesn't
involve a "dangerous" PHV. In particular we now recognize that
small join_collapse_limit or from_collapse_limit could prevent it.
Therefore, let's bite the bullet and make the case work.
We don't have to extend the executor's support for nestloop parameters
as I thought at the time, because we can instead push the evaluation
of the placeholder's expression into the left-hand input of the
NestLoop node. So there's not really a lot of downside to this
solution, and giving the planner more join-order flexibility should
have value beyond just avoiding failure.
Having said that, there surely is a nonzero risk of introducing
new bugs. Since this failure mode escaped detection for ten years,
such cases don't seem common enough to justify a lot of risk.
Therefore, let's put this fix into master but leave the back branches
alone (for now anyway).
Bug: #18953
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Diagnosed-by: Richard Guo <guofenglinux@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18953-1c9883a9d4afeb30@postgresql.org
|
|
Commit 8492feb98f6 added support for parallel CREATE INDEX on GIN indexes.
However, previously two places in the documentation and two in the source
code comments still stated that only B-tree and BRIN indexes support
parallel builds.
This commit updates those references to correctly include GIN indexes.
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/7d27d068-90e2-4022-9bd7-09b0fd3d4f47@oss.nttdata.com
|
|
The code carelessly modified mtstate->ps.plan->targetlist,
which it's not supposed to do. Fortunately, there's not
really any need to do that because the planner already
set up a perfectly acceptable targetlist for the plan node.
We just need to remove the erroneous assignments and update some
relevant comments.
As it happens, the erroneous assignments caused the targetlist to
point to a different part of the source plan tree, so that there
isn't really a risk of the pointer becoming dangling after executor
termination. The only visible effect of this change we can find is
that EXPLAIN will show upper references to the ModifyTable's output
expressions using different variables. Formerly it showed Vars from
the first target relation that survived executor-startup pruning.
Now it always shows such references using the first relation appearing
in the planner output, independently of what happens during executor
pruning. On the whole that seems like a good thing.
Also make a small tweak in ExplainPreScanNode to ensure that the first
relation will receive a refname assignment in set_rtable_names, even
if it got pruned at startup. Previously the Vars might be shown
without any table qualification, which is confusing in a multi-table
query.
I considered back-patching this, but since the bug doesn't seem to
have any really terrible consequences in existing branches, it
seems better to not change their EXPLAIN output. It's not too late
for v18 though, especially since v18 already made other changes in
the EXPLAIN output for these cases.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Andres Freund <andres@anarazel.de>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/213261.1747611093@sss.pgh.pa.us
|
|
As pointed out by Tom Lane, the patch introduced fragile and invasive
design around plan invalidation handling when locking of prunable
partitions was deferred from plancache.c to the executor. In
particular, it violated assumptions about CachedPlan immutability and
altered executor APIs in ways that are difficult to justify given the
added complexity and overhead.
This also removes the firstResultRels field added to PlannedStmt in
commit 28317de72, which was intended to support deferred locking of
certain ModifyTable result relations.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/605328.1747710381@sss.pgh.pa.us
|
|
When creating a new PlannerGlobal node in standard_planner(), most
fields are explicitly initialized, but a few are not. This doesn't
cause any functional issues, as makeNode() zeroes all fields by
default. However, the inconsistency is undesirable from a clarity and
maintenance perspective.
This patch explicitly initializes the remaining fields to improve
consistency and readability.
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-TgQHNOiouqGcuHoBqbJjWyx4UxGKxUY3FrF4trGbcPA@mail.gmail.com
|
|
When creating an explicit Sort node for the outer path of a mergejoin,
we need to determine the number of presorted keys of the outer path to
decide whether explicit incremental sort can be applied. Currently,
this is done by repeatedly calling pathkeys_count_contained_in.
This patch caches the number of presorted outer pathkeys in MergePath,
allowing us to save several calls to pathkeys_count_contained_in. It
can be considered a complement to the changes in commit 828e94c9d.
Reported-by: David Rowley <dgrowleyml@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqvBireB_w6x8BN5txdvBEHxVgZBt=rUnpf5ww5P_E_ww@mail.gmail.com
|
|
fc069a3a6319 implemented Self-Join Elimination (SJE) and put related logic
to ChangeVarNodes_walker(). This commit provides refactoring to remove the
SJE-related logic from ChangeVarNodes_walker() but adds a custom callback to
ChangeVarNodesExtended(), which has a chance to process a node before
ChangeVarNodes_walker(). Passing this callback to ChangeVarNodesExtended()
allows SJE-related node handling to be kept within the analyzejoins.c.
Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49PE3CvnV8vrQ0Dr%3DHqgZZmX0tdNbzVNJxqc8yg-8kDQQ%40mail.gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
|
|
This reverts commit 250a718aadad68793e82103282247556a46a3cfc.
It shouldn't be pushed during the release freeze.
Reported-by: Tom Lane
Discussion: https://postgr.es/m/E1uBIbY-000owH-0O%40gemulon.postgresql.org
|
|
fc069a3a6319 implemented Self-Join Elimination (SJE) and put related logic
to ChangeVarNodes_walker(). This commit provides refactoring to remove the
SJE-related logic from ChangeVarNodes_walker() but adds a custom callback to
ChangeVarNodesExtended(), which has a chance to process a node before
ChangeVarNodes_walker(). Passing this callback to ChangeVarNodesExtended()
allows SJE-related node handling to be kept within the analyzejoins.c.
Reported-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49PE3CvnV8vrQ0Dr%3DHqgZZmX0tdNbzVNJxqc8yg-8kDQQ%40mail.gmail.com
Author: Andrei Lepikhov <lepihov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
|
|
fc069a3a6319 implements Self-Join Elimination (SJE), which can remove base
relations when appropriate. However, regressions tests for SJE only cover
the case when placeholder variables (PHVs) are evaluated and needed only
in a single base rel. If this baserel is removed due to SJE, its clauses,
including PHVs, will be transferred to the keeping relation. Removing these
PHVs may trigger an error on plan creation -- thanks to the b3ff6c742f6c for
detecting that.
This commit skips removal of PHVs during SJE. This might also happen that
we skip the removal of some PHVs that could be removed. However, the overhead
of extra PHVs is small compared to the complexity of analysis needed to remove
them.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Alena Rybakina <a.rybakina@postgrespro.ru>
Author: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
|
|
1349d2790 added support so that aggregate functions with an ORDER BY or
DISTINCT clause could make use of presorted inputs to avoid an implicit
sort within nodeAgg.c. That commit failed to consider that a FILTER
clause may exist that filters rows before the aggregate function
arguments are evaluated. That can be problematic if an aggregate
argument contains an expression which could error out during evaluation.
It's perfectly valid to want to have a FILTER clause which eliminates
such values, and with the pre-sorted path added in 1349d2790, it was
possible that the planner would produce a plan with a Sort node above
the Aggregate to perform the sort on the aggregate's arguments long before
the Aggregate node would filter out the non-matching values.
Here we fix this by inspecting ORDER BY / DISTINCT aggregate functions
which have a FILTER clause to see if the aggregate's arguments are
anything more complex than a Var or a Const. Evaluating these isn't
going to cause an error. If we find any non-Var, non-Const parameters
then the planner will now opt to perform the sort in the Aggregate node
for these aggregates, i.e. disable the presorted aggregate optimization.
An alternative fix would have been to completely disallow the presorted
optimization for Aggrefs with any FILTER clause, but that wasn't done as
that could cause large performance regressions for queries that see
significant gains from 1349d2790 due to presorted results coming in from
an Index Scan.
Backpatch to 16, where 1349d2790 was introduced
Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: Kaimeh <kkaimeh@gmail.com>
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAK-%2BJz9J%3DQ06-M7cDJoPNeYbz5EZDqkjQbJnmRyQyzkbRGsYkA%40mail.gmail.com
Backpatch-through: 16
|
|
When planning queries to partitioned tables, we clone all
EquivalenceMembers belonging to the partitioned table into em_is_child
EquivalenceMembers for each non-pruned partition. For partitioned tables
with large numbers of partitions, this meant the ec_members list could
become large and code searching that list would become slow. Effectively,
the more partitions which were present, the more searches needed to be
performed for operations such as find_ec_member_matching_expr() during
create_plan() and the more partitions present, the longer these searches
would take, i.e., a quadratic slowdown.
To fix this, here we adjust how we store EquivalenceMembers for
em_is_child members. Instead of storing these directly in ec_members,
these are now stored in a new array of Lists in the EquivalenceClass,
which is indexed by the relid. When we want to find EquivalenceMembers
belonging to a certain child relation, we can narrow the search to the
array element for that relation.
To make EquivalenceMember lookup easier and to reduce the amount of code
change, this commit provides a pair of functions to allow iteration over
the EquivalenceMembers of an EC which also handles finding the child
members, if required. Callers that never need to look at child members
can remain using the foreach loop over ec_members, which will now often
be faster due to only parent-level members being stored there.
The actual performance increases here are highly dependent on the number
of partitions and the query being planned. Performance increases can be
visible with as few as 8 partitions, but the speedup is marginal for
such low numbers of partitions. The speedups become much more visible
with a few dozen to hundreds of partitions. With some tested queries
using 56 partitions, the planner was around 3x faster than before. For
use cases with thousands of partitions, these are likely to become
significantly faster. Some testing has shown planner speedups of 60x or
more with 8192 partitions.
Author: Yuya Watari <watari.yuya@gmail.com>
Co-authored-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Tested-by: Thom Brown <thom@linux.com>
Tested-by: newtglobal postgresql_contributors <postgresql_contributors@newtglobalcorp.com>
Discussion: https://postgr.es/m/CAJ2pMkZNCgoUKSE%2B_5LthD%2BKbXKvq6h2hQN8Esxpxd%2Bcxmgomg%40mail.gmail.com
|