Age | Commit message (Collapse) | Author |
|
When replica identity is FULL (an admittedly unusual case), the loop
that searches for tuples in execReplication.c didn't stop scanning the
table when once a matching tuple was found. Add the missing 'break'.
Note slight behavior change: we now return the first matching tuple
rather than the last one. They are supposed to be indistinguishable
anyway, so this shouldn't matter.
Author: Konstantin Knizhnik
Discussion: https://postgr.es/m/379743f6-ae91-b866-f7a2-5624e6d2b0a4@postgrespro.ru
|
|
The code checking whether an aggregate transition value needs to be
reparented into the current context has always only compared the
transition return value with the previous transition value by datum,
i.e. without regard for NULLness. This normally works, because when
the transition function returns NULL (via fcinfo->isnull), it'll
return a value that won't be the same as its input value.
But there's no hard requirement that that's the case. And it turns
out, it's possible to hit this case (see discussion or reproducers),
leading to a non-null transition value not being reparented, followed
by a crash caused by that.
Instead of adding another comparison of NULLness, instead have
ExecAggTransReparent() ensure that pergroup->transValue ends up as 0
when the new transition value is NULL. That avoids having to add an
additional branch to the much more common cases of the transition
function returning the old transition value (which is a pointer in
this case), and when the new value is different, but not NULL.
In branches since 69c3936a149, also deduplicate the reparenting code
between the expression evaluation based transitions, and the path for
ordered aggregates.
Reported-By: Teodor Sigaev, Nikita Glukhov
Author: Andres Freund
Discussion: https://postgr.es/m/bd34e930-cfec-ea9b-3827-a8bc50891393@sigaev.ru
Backpatch: 9.4-, this issue has existed since at least 7.4
|
|
Commit 9b63c13f0 turns out to have been fundamentally misguided:
the parent node's subPlan list is by no means the only way in which
a child SubPlan node can be hooked into the outer execution state.
As shown in bug #16213 from Matt Jibson, we can also get short-lived
tuple table slots added to the outer es_tupleTable list. At this point
I have little faith that there aren't other possible connections as
well; the long time it took to notice this problem shows that this
isn't a heavily-exercised situation.
Therefore, revert that fix, returning to the coding that passed a
NULL parent plan pointer down to the transiently-built subexpressions.
That gives us a pretty good guarantee that they won't hook into the
outer executor state in any way. But then we need some other solution
to make SubPlans work. Adopt the solution speculated about in the
previous commit's log message: do expression initialization at plan
startup for just those VALUES rows containing SubPlans, abandoning the
goal of reclaiming memory intra-query for those rows. In practice it
seems unlikely that queries containing a vast number of VALUES rows
would be using SubPlans in them, so this should not give up much.
(BTW, this test case also refutes my claim in connection with the prior
commit that the issue only arises with use of LATERAL. That was just
wrong: some variants of SubLink always produce SubPlans.)
As with previous patch, back-patch to all supported branches.
Discussion: https://postgr.es/m/16213-871ac3bc208ecf23@postgresql.org
|
|
A view with conditional INSTEAD rules and no unconditional INSTEAD
rules or INSTEAD OF triggers is not auto-updatable. Previously we
relied on a check in the executor to catch this, but that's
problematic since the planner may fail to properly handle such a query
and thus return a particularly unhelpful error to the user, before
reaching the executor check.
Instead, trap this in the rewriter and report the correct error there.
Doing so also allows us to include more useful error detail than the
executor check can provide. This doesn't change the existing behaviour
of updatable views; it merely ensures that useful error messages are
reported when a view isn't updatable.
Per report from Pengzhou Tang, though not adopting that suggested fix.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/CAG4reAQn+4xB6xHJqWdtE0ve_WqJkdyCV4P=trYr4Kn8_3_PEA@mail.gmail.com
|
|
Our algorithm for choosing batch numbers turned out not to work
effectively for multi-billion key inner relations. We would use
more hash bits than we have, and effectively concentrate all tuples
into a smaller number of batches than we intended. While ideally
we should switch to wider hashes, for now, change the algorithm to
one that effectively gives up bits from the bucket number when we
don't have enough bits. That means we'll finish up with longer
bucket chains than would be ideal, but that's better than having
batches that don't fit in work_mem and can't be divided.
Batch-patch to all supported releases.
Author: Thomas Munro
Reviewed-by: Tom Lane, thanks also to Tomas Vondra, Alvaro Herrera, Andres Freund for testing and discussion
Reported-by: James Coleman
Discussion: https://postgr.es/m/16104-dc11ed911f1ab9df%40postgresql.org
|
|
Revert part of commit 19df1702f5.
Early shutdown was added by that commit so that we could collect
statistics from workers, but unfortunately, it interacted badly with
rescans. The problem is that we ended up destroying the parallel context
which is required for rescans. This leads to rescans of a Limit node over
a Gather node to produce unpredictable results as it tries to access
destroyed parallel context. By reverting the early shutdown code, we
might lose statistics in some cases of Limit over Gather [Merge], but that
will require further study to fix.
Reported-by: Jerry Sievers
Diagnosed-by: Thomas Munro
Author: Amit Kapila, testcase by Vignesh C
Backpatch-through: 9.6
Discussion: https://postgr.es/m/87ims2amh6.fsf@jsievers.enova.com
|
|
Since WITH CHECK OPTION was introduced, ExecInitModifyTable has
initialized WCO expressions with the wrong plan node as parent -- that is,
it passed its input subplan not the ModifyTable node itself. Up to now
we thought this was harmless, but bug #16006 from Vinay Banakar shows it's
not: if the input node is a SubqueryScan then ExecInitWholeRowVar can get
confused into doing the wrong thing. (The fact that ExecInitWholeRowVar
contains such logic is certainly a horrid kluge that doesn't deserve to
live, but figuring out another way to do that is a task for some other day.)
Andres had already noticed the wrong-parent mistake and fixed it in commit
148e632c0, but not being aware of any user-visible consequences, he quite
reasonably didn't back-patch. This patch is simply a back-patch of
148e632c0, plus addition of a test case based on bug #16006. I also added
the test case to v12/HEAD, even though the bug is already fixed there.
Back-patch to all supported branches. 9.4 lacks RLS policies so the
new test case doesn't work there, but I'm pretty sure a test could be
devised based on using a whole-row Var in a plain WITH CHECK OPTION
condition. (I lack the cycles to do so myself, though.)
Andres Freund and Tom Lane
Discussion: https://postgr.es/m/16006-99290d2e4642cbd5@postgresql.org
Discussion: https://postgr.es/m/20181205225213.hiwa3kgoxeybqcqv@alap3.anarazel.de
|
|
Otherwise the executor can't see trigger transition tables during
EPQ evaluation. Fixes bug #15900 and almost certainly also #15720.
Back-patch to 10, where trigger transition tables landed.
Author: Alex Aktsipetrov
Reviewed-by: Thomas Munro, Tom Lane
Discussion: https://postgr.es/m/15900-bc482754fe8d7415%40postgresql.org
Discussion: https://postgr.es/m/15720-38c2b29e5d720187%40postgresql.org
|
|
The stated reason for acquiring predicate locks on heap pages hasn't
existed since commit c01262a8, so fix the comment. Perhaps in a later
release we'll also be able to change the code to use tuple locks.
Back-patch all the way.
Reviewed-by: Ashwin Agrawal
Discussion: https://postgr.es/m/CAEepm%3D2GK3FVdnt5V3d%2Bh9njWipCv_fNL%3DwjxyUhzsF%3D0PcbNg%40mail.gmail.com
|
|
When this code was initially introduced in commit d1b7c1ff, the structure
used was SharedPlanStateInstrumentation, but later when it got changed to
Instrumentation structure in commit b287df70, we forgot to update the
comment.
Reported-by: Wu Fei
Author: Wu Fei
Reviewed-by: Amit Kapila
Backpatch-through: 9.6
Discussion: https://postgr.es/m/52E6E0843B9D774C8C73D6CF64402F0562215EB2@G08CNEXMBPEKD02.g08.fujitsu.local
|
|
When there were duplicate columns in the hash key list, the array
sizes could be miscomputed, resulting in access off the end of the
array. Adjust the computation to ensure the array is always large
enough.
(I considered whether the duplicates could be removed in planning, but
I can't rule out the possibility that duplicate columns might have
different hash functions assigned. Simpler to just make sure it works
at execution time regardless.)
Bug apparently introduced in fc4b3dea2 as part of narrowing down the
tuples stored in the hashtable. Reported by Colm McHugh of Salesforce,
though I didn't use their patch. Backpatch back to version 10 where
the bug was introduced.
Discussion: https://postgr.es/m/CAFeeJoKKu0u+A_A9R9316djW-YW3-+Gtgvy3ju655qRHR3jtdA@mail.gmail.com
|
|
The es_root_result_relations array needs to be shallow-copied in the
same way as the main es_result_relations array, else EPQ rechecks on
partitioned result relations fail, as seen in bug #15677 from
Norbert Benkocs.
Amit Langote, isolation test case added by me
Discussion: https://postgr.es/m/15677-0bf089579b4cd02d@postgresql.org
Discussion: https://postgr.es/m/19321.1554567786@sss.pgh.pa.us
|
|
Before those commits, partitioning-related code in the executor could
assume that ModifyTableState.resultRelInfo[] contains only leaf partitions.
However, now a fully-pruned update results in a dummy ModifyTable that
references the root partitioned table, and that breaks some stuff.
In v11, this led to an assertion or core dump in the tuple routing code.
Fix by disabling tuple routing, since we don't need that anyway.
(I chose to do that in HEAD as well for safety, even though the problem
doesn't manifest in HEAD as it stands.)
In v10, this confused ExecInitModifyTable's decision about whether it
needed to close the root table. But we can get rid of that altogether
by being smarter about where to find the root table.
Note that since the referenced commits haven't shipped yet, this
isn't fixing any bug the field has seen.
Amit Langote, per a report from me
Discussion: https://postgr.es/m/20710.1554582479@sss.pgh.pa.us
|
|
The issue the comment is referring to was fixed by
08859bb5c2cebc132629ca838113d27bb31b990c.
|
|
Historically we forbade datatype-specific comparison functions from
returning INT_MIN, so that it would be safe to invert the sort order
just by negating the comparison result. However, this was never
really safe for comparison functions that directly return the result
of memcmp(), strcmp(), etc, as POSIX doesn't place any such restriction
on those library functions. Buildfarm results show that at least on
recent Linux on s390x, memcmp() actually does return INT_MIN sometimes,
causing sort failures.
The agreed-on answer is to remove this restriction and fix relevant
call sites to not make such an assumption; code such as "res = -res"
should be replaced by "INVERT_COMPARE_RESULT(res)". The same is needed
in a few places that just directly negated the result of memcmp or
strcmp.
To help find places having this problem, I've also added a compile option
to nbtcompare.c that causes some of the commonly used comparators to
return INT_MIN/INT_MAX instead of their usual -1/+1. It'd likely be
a good idea to have at least one buildfarm member running with
"-DSTRESS_SORT_INT_MIN". That's far from a complete test of course,
but it should help to prevent fresh introductions of such bugs.
This is a longstanding portability hazard, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/20180928185215.ffoq2xrq5d3pafna@alap3.anarazel.de
|
|
The documented shortcoming was actually fixed in 4c728f3829
so the comment is not true anymore.
|
|
In a case where we have multiple relation-scan nodes in a cursor plan,
such as a scan of an inheritance tree, it's possible to fetch from a
given scan node, then rewind the cursor and fetch some row from an
earlier scan node. In such a case, execCurrent.c mistakenly thought
that the later scan node was still active, because ExecReScan hadn't
done anything to make it look not-active. We'd get some sort of
failure in the case of a SeqScan node, because the node's scan tuple
slot would be pointing at a HeapTuple whose t_self gets reset to
invalid by heapam.c. But it seems possible that for other relation
scan node types we'd actually return a valid tuple TID to the caller,
resulting in updating or deleting a tuple that shouldn't have been
considered current. To fix, forcibly clear the ScanTupleSlot in
ExecScanReScan.
Another issue here, which seems only latent at the moment but could
easily become a live bug in future, is that rewinding a cursor does
not necessarily lead to *immediately* applying ExecReScan to every
scan-level node in the plan tree. Upper-level nodes will think that
they can postpone that call if their child node is already marked
with chgParam flags. I don't see a way for that to happen today in
a plan tree that's simple enough for execCurrent.c's search_plan_tree
to understand, but that's one heck of a fragile assumption. So, add
some logic in search_plan_tree to detect chgParam flags being set on
nodes that it descended to/through, and assume that that means we
should consider lower scan nodes to be logically reset even if their
ReScan call hasn't actually happened yet.
Per bug #15395 from Matvey Arye. This has been broken for a long time,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/153764171023.14986.280404050547008575@wrigleys.postgresql.org
|
|
The original coding for XMLTABLE thought it could represent a default
namespace by a T_String Value node with a null string pointer. That's
not okay, though; in particular outfuncs.c/readfuncs.c are not on board
with such a representation, meaning you'll get a null pointer crash
if you try to store a view or rule containing this construct.
To fix, change the parsetree representation so that we have a NULL
list element, instead of a bogus Value node.
This isn't really a functional limitation since default XML namespaces
aren't yet implemented in the executor; you'd just get "DEFAULT
namespace is not supported" anyway. But crashes are not nice, so
back-patch to v10 where this syntax was added. Ordinarily we'd consider
a parsetree representation change to be un-backpatchable; but since
existing releases would crash on the way to storing such constructs,
there can't be any existing views/rules to be incompatible with.
Per report from Andrey Lepikhov.
Discussion: https://postgr.es/m/3690074f-abd2-56a9-144a-aa5545d7a291@postgrespro.ru
|
|
The EvalPlanQual machinery assumes that any initplans (that is,
uncorrelated sub-selects) used during an EPQ recheck would have already
been evaluated during the main query; this is implicit in the fact that
execPlan pointers are not copied into the EPQ estate's es_param_exec_vals.
But it's possible for that assumption to fail, if the initplan is only
reached conditionally. For example, a sub-select inside a CASE expression
could be reached during a recheck when it had not been previously, if the
CASE test depends on a column that was just updated.
This bug is old, appearing to date back to my rewrite of EvalPlanQual in
commit 9f2ee8f28, but was not detected until Kyle Samson reported a case.
To fix, force all not-yet-evaluated initplans used within the EPQ plan
subtree to be evaluated at the start of the recheck, before entering the
EPQ environment. This could be inefficient, if such an initplan is
expensive and goes unused again during the recheck --- but that's piling
one layer of improbability atop another. It doesn't seem worth adding
more complexity to prevent that, at least not in the back branches.
It was convenient to use the new-in-v11 ExecEvalParamExecParams function
to implement this, but I didn't like either its name or the specifics of
its API, so revise that.
Back-patch all the way. Rather than rewrite the patch to avoid depending
on bms_next_member() in the oldest branches, I chose to back-patch that
function into 9.4 and 9.3. (This isn't the first time back-patches have
needed that, and it exhausted my patience.) I also chose to back-patch
some test cases added by commits 71404af2a and 342a1ffa2 into 9.4 and 9.3,
so that the 9.x versions of eval-plan-qual.spec are all the same.
Andrew Gierth diagnosed the problem and contributed the added test cases,
though the actual code changes are by me.
Discussion: https://postgr.es/m/A033A40A-B234-4324-BE37-272279F7B627@tripadvisor.com
|
|
This patch removes two sources of interference between nominally
independent functions when one SPI-using function calls another,
perhaps without knowing that it does so.
Chapman Flack pointed out that xml.c's query_to_xml_internal() expects
SPI_tuptable and SPI_processed to stay valid across datatype output
function calls; but it's possible that such a call could involve
re-entrant use of SPI. It seems likely that there are similar hazards
elsewhere, if not in the core code then in third-party SPI users.
Previously SPI_finish() reset SPI's API globals to zeroes/nulls, which
would typically make for a crash in such a situation. Restoring them
to the values they had at SPI_connect() seems like a considerably more
useful behavior, and it still meets the design goal of not leaving any
dangling pointers to tuple tables of the function being exited.
Also, cause SPI_connect() to reset these variables to zeroes/nulls after
saving them. This prevents interference in the opposite direction: it's
possible that a SPI-using function that's only ever been tested standalone
contains assumptions that these variables start out as zeroes. That was
the case as long as you were the outermost SPI user, but not so much for
an inner user. Now it's consistent.
Report and fix suggestion by Chapman Flack, actual patch by me.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/9fa25bef-2e4f-1c32-22a4-3ad0723c4a17@anastigmatix.net
|
|
When executing a SubPlan in an expression, the EState's direction
field was left alone, resulting in an attempt to execute the subplan
backwards if it was encountered during a backwards scan of a cursor.
Also, though much less likely, it was possible to reach the execution
of an InitPlan while in backwards-scan state.
Repair by saving/restoring estate->es_direction and forcing forward
scan mode in the relevant places.
Backpatch all the way, since this has been broken since 8.3 (prior to
commit c7ff7663e, SubPlans had their own EStates rather than sharing
the parent plan's, so there was no confusion over scan direction).
Per bug #15336 reported by Vladimir Baranoff; analysis and patch by
me, review by Tom Lane.
Discussion: https://postgr.es/m/153449812167.1304.1741624125628126322@wrigleys.postgresql.org
|
|
After commits a315b967cc and b805b63ac2, part of the comment atop
ExecShutdownNode is redundant. Adjust it.
Author: Amit Kapila
Backpatch-through: 10 where both the mentioned commits are present.
Discussion: https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info
|
|
Currently, we release the asynchronous resources as soon as it is evident
that no more rows will be needed e.g. when a Limit is filled. This can be
problematic especially for custom and foreign scans where we can scan
backward. Fix that by disallowing the shutting down of resources in such
cases.
Reported-by: Robert Haas
Analysed-by: Robert Haas and Amit Kapila
Author: Amit Kapila
Reviewed-by: Robert Haas
Backpatch-through: 9.6 where this code was introduced
Discussion: https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info
|
|
Multiple calls to XMLTABLE in a query (e.g. laterally applying it to a
table with an xml column, an important use-case) were leaking large
amounts of memory into the per-query context, blowing up memory usage.
Repair by reorganizing memory context usage in nodeTableFuncscan; use
the usual per-tuple context for row-by-row evaluations instead of
perValueCxt, and use the explicitly created context -- renamed from
perValueCxt to perTableCxt -- for arguments and state for each
individual table-generation operation.
Backpatch to PG10 where this code was introduced.
Original report by IRC user begriffs; analysis and patch by me.
Reviewed by Tom Lane and Pavel Stehule.
Discussion: https://postgr.es/m/153394403528.10284.7530399040974170549@wrigleys.postgresql.org
|
|
The buffer usage stats is accounted only for the execution phase of the
node. For Gather and Gather Merge nodes, such stats are accumulated at
the time of shutdown of workers which is done after execution of node due
to which we missed to account them for such nodes. Fix it by treating
nodes as running while we shut down them.
We can also miss accounting for a Limit node when Gather or Gather Merge
is beneath it, because it can finish the execution before shutting down
such nodes. So we allow a Limit node to shut down the resources before it
completes the execution.
In the passing fix the gather node code to allow workers to shut down as
soon as we find that all the tuples from the workers have been retrieved.
The original code use to do that, but is accidently removed by commit
01edb5c7fc.
Reported-by: Adrien Nayrat
Author: Amit Kapila and Robert Haas
Reviewed-by: Robert Haas and Andres Freund
Backpatch-through: 9.6 where this code was introduced
Discussion: https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info
|
|
In the leader backend, we don't track the buffer usage for ExecutorStart
phase whereas in worker backend we track it for ExecutorStart phase as
well. This leads to different value for buffer usage stats for the
parallel and non-parallel query. Change the code so that worker backend
also starts tracking buffer usage after ExecutorStart.
Author: Amit Kapila and Robert Haas
Reviewed-by: Robert Haas and Andres Freund
Backpatch-through: 9.6 where this code was introduced
Discussion:https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info
|
|
I was confused by what "intended to be parallel serially" meant, until
Robert Haas and David G. Johnston explained it. Rephrase the comment to
make it more clear, using David's suggested wording.
Discussion: https://www.postgresql.org/message-id/1fec9022-41e8-e484-70ce-2179b08c2092%40iki.fi
|
|
A collection of typos I happened to spot while reading code, as well as
grepping for common mistakes.
Backpatch to all supported versions, as applicable, to avoid conflicts
when backporting other commits in the future.
|
|
A slot can not be stored in a tuple but it's vice versa.
Reported-by: Ashutosh Bapat
Author: Ashutosh Bapat
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAFjFpRcHhNhXdegyJv3KKDWrwO1_NB_KYZM_ZSDeMOZaL1A5jQ@mail.gmail.com
|
|
|
|
The 'orig_slot' argument was removed in commit c0a8ae7be392, but that
commit forgot to update the comment.
Author: Amit Langote
Discussion: https://www.postgresql.org/message-id/194ac4bf-7b4a-c887-bf26-bc1a85ea995a@lab.ntt.co.jp
|
|
The impact of VARIADIC on the combine/serialize/deserialize support
functions of an aggregate wasn't thought through carefully. There is
actually no impact, because variadicity isn't passed through to these
functions (and it doesn't seem like it would need to be). However,
lookup_agg_function was mistakenly told to check things as though it were
passed through. The net result was that it was impossible to declare an
aggregate that had both VARIADIC input and parallelism support functions.
In passing, fix a runtime check in nodeAgg.c for the combine function's
strictness to make its error message agree with the creation-time check.
The previous message was actually backwards, and it doesn't seem like
there's a good reason to have two versions of this message text anyway.
Back-patch to 9.6 where parallel aggregation was introduced.
Alexey Bashtanov; message fix by me
Discussion: https://postgr.es/m/f86dde87-fef4-71eb-0480-62754aaca01b@imap.cc
|
|
We make some changes to ModifyTableState and the EState it uses whenever
we route tuples to partitions; but we weren't restoring properly in all
cases, possibly causing crashes when partitions with different tuple
descriptors are targeted by tuples inserted in the same command.
Refactor some code, creating ExecPrepareTupleRouting, to encapsulate the
needed state changing logic, and have it invoked one level above its
current place (ie. put it in ExecModifyTable instead of ExecInsert);
this makes it all more readable.
Add a test case to exercise this.
We don't support having views as partitions; and since only views can
have INSTEAD OF triggers, there is no point in testing for INSTEAD OF
when processing insertions into a partitioned table. Remove code that
appears to support this (but which is actually never relevant.)
In passing, fix location of some very confusing comments in
ModifyTableState.
Reported-by: Amit Langote
Author: Etsuro Fujita, Amit Langote
Discussion: https://postgr/es/m/0473bf5c-57b1-f1f7-3d58-455c2230bc5f@lab.ntt.co.jp
|
|
"UPDATE/DELETE WHERE CURRENT OF cursor_name" failed, with an error message
like "cannot extract system attribute from virtual tuple", if the cursor
was using a index-only scan for the target table. Fix it by digging the
current TID out of the indexscan state.
It seems likely that the same failure could occur for CustomScan plans
and perhaps some FDW plan types, so that leaving this to be treated as an
internal error with an obscure message isn't as good an idea as it first
seemed. Hence, add a bit of heaptuple.c infrastructure to let us deliver
a more on-topic message. I chose to make the message match what you get
for the case where execCurrentOf can't identify the target scan node at
all, "cursor "foo" is not a simply updatable scan of table "bar"".
Perhaps it should be different, but we can always adjust that later.
In the future, it might be nice to provide hooks that would let custom
scan providers and/or FDWs deal with this in other ways; but that's
not a suitable topic for a back-patchable bug fix.
It's been like this all along, so back-patch to all supported branches.
Yugo Nagata and Tom Lane
Discussion: https://postgr.es/m/20180201013349.937dfc5f.nagata@sraoss.co.jp
|
|
ExecHashTableCreate allocated some memory that wasn't freed by
ExecHashTableDestroy, specifically the per-hash-key function information.
That's not a huge amount of data, but if one runs a query that repeats
a hash join enough times, it builds up. Fix by arranging for the data
in question to be kept in the hashtable's hashCxt instead of leaving it
"loose" in the query-lifespan executor context. (This ensures that we'll
also clean up anything that the hash functions allocate in fn_mcxt.)
Per report from Amit Khandekar. It's been like this forever, so back-patch
to all supported branches.
Discussion: https://postgr.es/m/CAJ3gD9cFofAWGvcxLOxDHC=B0hjtW8yGmUsF2hdGh97CM38=7g@mail.gmail.com
|
|
Tom Kazimiers reported that transition tables don't work correctly when
they are scanned by more than one executor node. That's because commit
18ce3a4ab allocated separate read pointers for each executor node, as it
must, but failed to make them active at the appropriate times. Repair.
Thomas Munro
Discussion: https://postgr.es/m/20180224034748.bixarv6632vbxgeb%40dewberry.localdomain
|
|
An updating query that reads a CTE within an InitPlan or SubPlan could get
incorrect results if it updates rows that are concurrently being modified.
This is caused by CteScanNext supposing that nothing inside its recursive
ExecProcNode call could change which read pointer is selected in the CTE's
shared tuplestore. While that's normally true because of scoping
considerations, it can break down if an EPQ plan tree gets built during the
call, because EvalPlanQualStart builds execution trees for all subplans
whether they're going to be used during the recheck or not. And it seems
like a pretty shaky assumption anyway, so let's just reselect our own read
pointer here.
Per bug #14870 from Andrei Gorita. This has been broken since CTEs were
implemented, so back-patch to all supported branches.
Discussion: https://postgr.es/m/20171024155358.1471.82377@wrigleys.postgresql.org
|
|
The changes in b81b5a96f424531b97cdd1dba97d9d1b9c9d372e did not fully
address the issue, because the bit-mixing of the IV into the final
hash-key didn't prevent clustering in the input-data survive in the
output data.
This didn't cause a lot of problems because of the additional growth
conditions added d4c62a6b623d6eef88218158e9fa3cf974c6c7e5. But as we
want to rein those in due to explosive growth in some edges, this
needs to be fixed.
Author: Andres Freund
Discussion: https://postgr.es/m/20171127185700.1470.20362@wrigleys.postgresql.org
Backpatch: 10, where simplehash was introduced
|
|
Commit 09529a70b changed nodeIndexscan.c and nodeIndexonlyscan.c to
postpone initialization of the indexscan proper until the first tuple
fetch. It overlooked the question of mark/restore behavior, which means
that if some caller attempts to mark the scan before the first tuple fetch,
you get a null pointer dereference.
The only existing user of mark/restore is nodeMergejoin.c, which (somewhat
accidentally) will never attempt to set a mark before the first inner tuple
unless the inner child node is a Material node. Hence the case can't arise
normally, so it seems sufficient to document the assumption at both ends.
However, during an EvalPlanQual recheck, ExecScanFetch doesn't call
IndexNext but just returns the jammed-in test tuple. Therefore, if we're
doing a recheck in a plan tree with a mergejoin with inner indexscan,
it's possible to reach ExecIndexMarkPos with iss_ScanDesc still null,
as reported by Guo Xiang Tan in bug #15032.
Really, when there's a test tuple supplied during an EPQ recheck, touching
the index at all is the wrong thing: rather, the behavior of mark/restore
ought to amount to saving and restoring the es_epqScanDone flag. We can
avoid finding a place to actually save the flag, for the moment, because
given the assumption that no caller will set a mark before fetching a
tuple, es_epqScanDone must always be set by the time we try to mark.
So the actual behavior change required is just to not reach the index
access if a test tuple is supplied.
The set of plan node types that need to consider this issue are those
that support EPQ test tuples (i.e., call ExecScan()) and also support
mark/restore; which is to say, IndexScan, IndexOnlyScan, and perhaps
CustomScan. It's tempting to try to fix the problem in one place by
teaching ExecMarkPos() itself about EPQ; but ExecMarkPos supports some
plan types that aren't Scans, and also it seems risky to make assumptions
about what a CustomScan wants to do here. Also, the most likely future
change here is to decide that we do need to support marks placed before
the first tuple, which would require additional work in IndexScan and
IndexOnlyScan in any case. Hence, fix the EPQ issue in nodeIndexscan.c
and nodeIndexonlyscan.c, accepting the small amount of code duplicated
thereby, and leave it to CustomScan providers to fix this bug if they
have it.
Back-patch to v10 where commit 09529a70b came in. In earlier branches,
the index_markpos() call is a waste of cycles when EPQ is active, but
no more than that, so it doesn't seem appropriate to back-patch further.
Discussion: https://postgr.es/m/20180126074932.3098.97815@wrigleys.postgresql.org
|
|
Otherwise, when the query string is read, we might trailing garbage
beyond the end, unless there happens to be a \0 there by good luck.
Report and patch by Thomas Munro. Reviewed by Rafia Sabih.
Discussion: http://postgr.es/m/CAEepm=2SJs7X+_vx8QoDu8d1SMEOxtLhxxLNzZun_BvNkuNhrw@mail.gmail.com
|
|
When a Gather or Gather Merge node is started and stopped multiple
times, accumulate instrumentation data only once, at the end, instead
of after each execution, to avoid recording inflated totals.
Commit 778e78ae9fa51e58f41cbdc72b293291d02d8984, the previous attempt
at a fix, instead reset the state after every execution, which worked
for the general instrumentation data but had problems for the additional
instrumentation specific to Sort and Hash nodes.
Report by hubert depesz lubaczewski. Analysis and fix by Amit Kapila,
following a design proposal from Thomas Munro, with a comment tweak
by me.
Discussion: http://postgr.es/m/20171127175631.GA405@depesz.com
|
|
es_query_dsa turns out to be broken by design, because it supposes
that there is only one DSA for the whole query, whereas there is
actually one per Gather (Merge) node. For now, work around that
problem by setting and clearing the pointer around the sections of
code that might need it. It's probably a better idea to get rid of
es_query_dsa altogether in favor of having each node keep track
individually of which DSA is relevant, but that seems like more than
we would want to back-patch.
Thomas Munro, reviewed and tested by Andreas Seltenreich, Amit
Kapila, and by me.
Discussion: http://postgr.es/m/CAEepm=1U6as=brnVvMNixEV2tpi8NuyQoTmO8Qef0-VV+=7MDA@mail.gmail.com
|
|
This reverts commit 778e78ae9fa51e58f41cbdc72b293291d02d8984. Per
further discussion, that doesn't seem to be the best possible fix.
Discussion: http://postgr.es/m/CAA4eK1LW2aFKzY3=vwvc=t-juzPPVWP2uT1bpx_MeyEqnM+p8g@mail.gmail.com
|
|
Reported-by: Noah Misch <noah@leadboat.com>
|
|
I noticed that _SPI_execute_plan initially sets spierrcontext.arg = NULL,
and only fills it in some time later. If an error were to happen in
between, _SPI_error_callback would try to dereference the null pointer.
This is unlikely --- there's not much between those points except
push-snapshot calls --- but it's clearly not impossible. Tweak the
callback to do nothing if the pointer isn't set yet.
It's been like this for awhile, so back-patch to all supported branches.
|
|
When a Gather or Gather Merge node is started and stopped multiple
times, the old code wouldn't reset the shared state between executions,
potentially resulting in dramatically inflated instrumentation data
for nodes beneath it. (The per-worker instrumentation ended up OK,
I think, but the overall totals were inflated.)
Report by hubert depesz lubaczewski. Analysis and fix by Amit Kapila,
reviewed and tweaked a bit by me.
Discussion: http://postgr.es/m/20171127175631.GA405@depesz.com
|
|
If we have a plan that uses parallelism but are unable to execute it
using parallelism, for example due to a lack of available DSM
segments, then the EState's es_query_dsa will be NULL. Parallel
bitmap heap scan needs to fall back to a non-parallel scan in such
cases.
Patch by me, reviewed by Dilip Kumar
Discussion: http://postgr.es/m/CAEepm=0kADK5inNf_KuemjX=HQ=PuTP0DykM--fO5jS5ePVFEA@mail.gmail.com
|
|
rewriteTargetListUD's processing is dependent on the relkind of the query's
target table. That was fine at the time it was made to act that way, even
for queries on inheritance trees, because all tables in an inheritance tree
would necessarily be plain tables. However, the 9.5 feature addition
allowing some members of an inheritance tree to be foreign tables broke the
assumption that rewriteTargetListUD's output tlist could be applied to all
child tables with nothing more than column-number mapping. This led to
visible failures if foreign child tables had row-level triggers, and would
also break in cases where child tables belonged to FDWs that used methods
other than CTID for row identification.
To fix, delay running rewriteTargetListUD until after the planner has
expanded inheritance, so that it is applied separately to the (already
mapped) tlist for each child table. We can conveniently call it from
preprocess_targetlist. Refactor associated code slightly to avoid the
need to heap_open the target relation multiple times during
preprocess_targetlist. (The APIs remain a bit ugly, particularly around
the point of which steps scribble on parse->targetList and which don't.
But avoiding such scribbling would require a change in FDW callback APIs,
which is more pain than it's worth.)
Also fix ExecModifyTable to ensure that "tupleid" is reset to NULL when
we transition from rows providing a CTID to rows that don't. (That's
really an independent bug, but it manifests in much the same cases.)
Add a regression test checking one manifestation of this problem, which
was that row-level triggers on a foreign child table did not work right.
Back-patch to 9.5 where the problem was introduced.
Etsuro Fujita, reviewed by Ildus Kurbangaliev and Ashutosh Bapat
Discussion: https://postgr.es/m/20170514150525.0346ba72@postgrespro.ru
|
|
When nodeValuesscan.c was written, it was impossible to have a SubPlan in
VALUES --- any sub-SELECT there would have to be uncorrelated and thereby
would produce an InitPlan instead. We therefore took a shortcut in the
logic that throws away a ValuesScan's per-row expression evaluation data
structures. This was broken by the introduction of LATERAL however; a
sub-SELECT containing a lateral reference produces a correlated SubPlan.
The cleanest fix for this would be to give up the optimization of
discarding the expression eval state. But that still seems pretty
unappetizing for long VALUES lists. It seems to work to just prevent
the subexpressions from hooking into the ValuesScan node's subPlan
list, so let's do that and see how well it works. (If this breaks,
due to additional connections between the subexpressions and the outer
query structures, we might consider compromises like throwing away data
only for VALUES rows not containing SubPlans.)
Per bug #14924 from Christian Duta. Back-patch to 9.3 where LATERAL
was introduced.
Discussion: https://postgr.es/m/20171124120836.1463.5310@wrigleys.postgresql.org
|
|
When strict aggregate combine functions, used in multi-stage/parallel
aggregation, returned NULL, we didn't check for that, invoking the
combine function with NULL the next round, despite it being strict.
The equivalent code invoking normal transition functions has a check
for that situation, which did not get copied in a7de3dc5c346. Fix the
bug by adding the equivalent check.
Based on a quick look I could not find any strict combine functions in
core actually returning NULL, and it doesn't seem very likely external
users have done so. So this isn't likely to have caused issues in
practice.
Add tests verifying transition / combine functions returning NULL is
tested.
Reported-By: Andres Freund
Author: Andres Freund
Discussion: https://postgr.es/m/20171121033642.7xvmjqrl4jdaaat3@alap3.anarazel.de
Backpatch: 9.6, where parallel aggregation was introduced
|