| Age | Commit message (Collapse) | Author |
|
|
|
Instead of serializing the transaction to disk after reaching the
logical_decoding_work_mem limit in memory, we consume the changes we have
in memory and invoke stream API methods added by commit 45fdc9738b.
However, sometimes if we have incomplete toast or speculative insert we
spill to the disk because we can't generate the complete tuple and stream.
And, as soon as we get the complete tuple we stream the transaction
including the serialized changes.
We can do this incremental processing thanks to having assignments
(associating subxact with toplevel xacts) in WAL right away, and
thanks to logging the invalidation messages at each command end. These
features are added by commits 0bead9af48 and c55040ccd0 respectively.
Now that we can stream in-progress transactions, the concurrent aborts
may cause failures when the output plugin consults catalogs (both system
and user-defined).
We handle such failures by returning ERRCODE_TRANSACTION_ROLLBACK
sqlerrcode from system table scan APIs to the backend or WALSender
decoding a specific uncommitted transaction. The decoding logic on the
receipt of such a sqlerrcode aborts the decoding of the current
transaction and continue with the decoding of other transactions.
We have ReorderBufferTXN pointer in each ReorderBufferChange by which we
know which xact it belongs to. The output plugin can use this to decide
which changes to discard in case of stream_abort_cb (e.g. when a subxact
gets discarded).
We also provide a new option via SQL APIs to fetch the changes being
streamed.
Author: Dilip Kumar, Tomas Vondra, Amit Kapila, Nikhil Sontakke
Reviewed-by: Amit Kapila, Kuntal Ghosh, Ajin Cherian
Tested-by: Neha Sharma, Mahendra Singh Thalor and Ajin Cherian
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
|
|
These flags are unused and always have been.
Discussion: https://postgr.es/m/20200805235549.GA8118@alvherre.pgsql
|
|
Post-commit review for commit 84c0e4b9.
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvptBx_%2BUPAzY0uXzopbvPVGKPeZ6Hoy8rnPcWz20Cr0Bw%40mail.gmail.com
|
|
Windows 64bit has 4-byte long values which is not suitable for tracking
disk space usage in the incremental sort code. Let's just make all these
fields int64s.
Author: James Coleman
Discussion: https://postgr.es/m/CAApHDvpky%2BUhof8mryPf5i%3D6e6fib2dxHqBrhp0Qhu0NeBhLJw%40mail.gmail.com
Backpatch-through: 13, where the incremental sort code was added
|
|
This allows AM-specific knowledge to be applied during creation of
pg_amop and pg_amproc entries. Specifically, the AM knows better than
core code which entries to consider as required or optional. Giving
the latter entries the appropriate sort of dependency allows them to
be dropped without taking out the whole opclass or opfamily; which
is something we'd like to have to correct obsolescent entries in
extensions.
This callback also opens the door to performing AM-specific validity
checks during opclass creation, rather than hoping than an opclass
developer will remember to test with "amvalidate". For the most part
I've not actually added any such checks yet; that can happen in a
follow-on patch. (Note that we shouldn't remove any tests from
"amvalidate", as those are still needed to cross-check manually
constructed entries in the initdb data. So adding tests to
"amadjustmembers" will be somewhat duplicative, but it seems like
a good idea anyway.)
Patch by me, reviewed by Alexander Korotkov, Hamid Akhtar, and
Anastasia Lubennikova.
Discussion: https://postgr.es/m/4578.1565195302@sss.pgh.pa.us
|
|
When reading the code it's not obvious when one should prefer dynahash
over simplehash and vice-versa, so, for programmer-friendliness, add
comments to inform that decision.
Show sample simplehash method signatures.
Author: James Coleman <jtc331@gmail.com>
Discussion: https://postgr.es/m/CAAaqYe_dOF39gAJ8rL-a3YO3Qo96MHMRQ2whFjK5ZcU6YvMQSA%40mail.gmail.com
|
|
Create an optional region in the main shared memory segment that can be
used to acquire and release "fast" DSM segments, and can benefit from
huge pages allocated at cluster startup time, if configured. Fall back
to the existing mechanisms when that space is full. The size is
controlled by a new GUC min_dynamic_shared_memory, defaulting to 0.
Main region DSM segments initially contain whatever garbage the memory
held last time they were used, rather than zeroes. That change revealed
that DSA areas failed to initialize themselves correctly in memory that
wasn't zeroed first, so fix that problem.
Discussion: https://postgr.es/m/CA%2BhUKGLAE2QBv-WgGp%2BD9P_J-%3Dyne3zof9nfMaqq1h3EGHFXYQ%40mail.gmail.com
|
|
local_blks_dirtied tracks the number of local blocks dirtied, not shared
ones.
Author: Kirk Jamison
Discussion: https://postgr.es/m/OSBPR01MB2341760686DC056DE89D2AB9EF710@OSBPR01MB2341.jpnprd01.prod.outlook.com
|
|
Avoid repeatedly calling lseek(SEEK_END) during recovery by caching
the size of each fork. For now, we can't use the same technique in
other processes, because we lack a shared invalidation mechanism.
Do this by generalizing the pre-existing caching used by FSM and VM
to support all forks.
Discussion: https://postgr.es/m/CAEepm%3D3SSw-Ty1DFcK%3D1rU-K6GSzYzfdD4d%2BZwapdN7dTa6%3DnQ%40mail.gmail.com
|
|
For pg_attribute, this allows to insert at once a full set of attributes
for a relation (roughly 15% of WAL reduction in extreme cases). For
pg_shdepend, this reduces the work done when creating new shared
dependencies from a database template. The number of slots used for the
insertion is capped at 64kB of data inserted for both, depending on the
number of items to insert and the length of the rows involved.
More can be done for other catalogs, like pg_depend. This part requires
a different approach as the number of slots to use depends also on the
number of entries discarded as pinned dependencies. This is also
related to the rework or dependency handling for ALTER TABLE and CREATE
TABLE, mainly.
Author: Daniel Gustafsson
Reviewed-by: Andres Freund, Michael Paquier
Discussion: https://postgr.es/m/20190213182737.mxn6hkdxwrzgxk35@alap3.anarazel.de
|
|
Create LatchWaitSet at backend startup time, and use it to implement
WaitLatch(). This avoids repeated epoll/kqueue setup and teardown
system calls.
Reorder SubPostmasterMain() slightly so that we restore the postmaster
pipe and Windows signal emulation before we reach InitPostmasterChild(),
to make this work in EXEC_BACKEND builds.
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJAC4Oqao%3DqforhNey20J8CiG2R%3DoBPqvfR0vOJrFysGw%40mail.gmail.com
|
|
Add a GUC that acts as a multiplier on work_mem. It gets applied when
sizing executor node hash tables that were previously size constrained
using work_mem alone.
The new GUC can be used to preferentially give hash-based nodes more
memory than the generic work_mem limit. It is intended to enable admin
tuning of the executor's memory usage. Overall system throughput and
system responsiveness can be improved by giving hash-based executor
nodes more memory (especially over sort-based alternatives, which are
often much less sensitive to being memory constrained).
The default value for hash_mem_multiplier is 1.0, which is also the
minimum valid value. This means that hash-based nodes continue to apply
work_mem in the traditional way by default.
hash_mem_multiplier is generally useful. However, it is being added now
due to concerns about hash aggregate performance stability for users
that upgrade to Postgres 13 (which added disk-based hash aggregation in
commit 1f39bce0). While the old hash aggregate behavior risked
out-of-memory errors, it is nevertheless likely that many users actually
benefited. Hash agg's previous indifference to work_mem during query
execution was not just faster; it also accidentally made aggregation
resilient to grouping estimate problems (at least in cases where this
didn't create destabilizing memory pressure).
hash_mem_multiplier can provide a certain kind of continuity with the
behavior of Postgres 12 hash aggregates in cases where the planner
incorrectly estimates that all groups (plus related allocations) will
fit in work_mem/hash_mem. This seems necessary because hash-based
aggregation is usually much slower when only a small fraction of all
groups can fit. Even when it isn't possible to totally avoid hash
aggregates that spill, giving hash aggregation more memory will reliably
improve performance (the same cannot be said for external sort
operations, which appear to be almost unaffected by memory availability
provided it's at least possible to get a single merge pass).
The PostgreSQL 13 release notes should advise users that increasing
hash_mem_multiplier can help with performance regressions associated
with hash aggregation. That can be taken care of by a later commit.
Author: Peter Geoghegan
Reviewed-By: Álvaro Herrera, Jeff Davis
Discussion: https://postgr.es/m/20200625203629.7m6yvut7eqblgmfo@alap3.anarazel.de
Discussion: https://postgr.es/m/CAH2-WzmD%2Bi1pG6rc1%2BCjc4V6EaFJ_qSuKCCHVnH%3DoruqD-zqow%40mail.gmail.com
Backpatch: 13-, where disk-based hash aggregation was introduced.
|
|
When fast promotion was supported in 9.3, non-fast promotion became
undocumented feature and it's basically not available for ordinary users.
However we decided not to remove non-fast promotion at that moment,
to leave it for a release or two for debugging purpose or as an emergency
method because fast promotion might have some issues, and then to
remove it later. Now, several versions were released since that decision
and there is no longer reason to keep supporting non-fast promotion.
Therefore this commit removes non-fast promotion.
Author: Fujii Masao
Reviewed-by: Hamid Akhtar, Kyotaro Horiguchi
Discussion: https://postgr.es/m/76066434-648f-f567-437b-54853b43398f@oss.nttdata.com
|
|
Use HyperLogLog to estimate the group cardinality in a spilled
partition. This estimate is used to choose the number of partitions if
we recurse.
The previous behavior was to use the number of tuples in a spilled
partition as the estimate for the number of groups, which lead to
overpartitioning. That could cause the number of batches to be much
higher than expected (with each batch being very small), which made it
harder to interpret EXPLAIN ANALYZE results.
Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/a856635f9284bc36f7a77d02f47bbb6aaf7b59b3.camel@j-davis.com
Backpatch-through: 13
|
|
Since the tableam.c code needs to make use of the syncscan.c routines
itself, and since other block-oriented AMs might also want to use it one
day, it didn't make sense for it to live under src/backend/access/heap.
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKGLCnG%3DNEAByg6bk%2BCT9JZD97Y%3DAxKhh27Su9FeGWOKvDg%40mail.gmail.com
|
|
This adds seven methods to the output plugin API, adding support for
streaming changes of large in-progress transactions.
* stream_start
* stream_stop
* stream_abort
* stream_commit
* stream_change
* stream_message
* stream_truncate
Most of this is a simple extension of the existing methods, with
the semantic difference that the transaction (or subtransaction)
is incomplete and may be aborted later (which is something the
regular API does not really need to deal with).
This also extends the 'test_decoding' plugin, implementing these
new stream methods.
The stream_start/start_stop are used to demarcate a chunk of changes
streamed for a particular toplevel transaction.
This commit simply adds these new APIs and the upcoming patch to "allow
the streaming mode in ReorderBuffer" will use these APIs.
Author: Tomas Vondra, Dilip Kumar, Amit Kapila
Reviewed-by: Amit Kapila
Tested-by: Neha Sharma and Mahendra Singh Thalor
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
|
|
Note: This GUC was originally named enable_hashagg_disk when it appeared
in commit 1f39bce0, which added disk-based hash aggregation. It was
subsequently renamed in commit 92c58fd9.
Author: Peter Geoghegan
Reviewed-By: Jeff Davis, Álvaro Herrera
Discussion: https://postgr.es/m/9d9d1e1252a52ea1bad84ea40dbebfd54e672a0f.camel%40j-davis.com
Backpatch: 13-, where disk-based hash aggregation was introduced.
|
|
Refactor hash lookups in nodeAgg.c to improve performance.
Author: Andres Freund and Jeff Davis
Discussion: https://postgr.es/m/20200612213715.op4ye4q7gktqvpuo%40alap3.anarazel.de
Backpatch-through: 13
|
|
Previously we would allocate blocks to parallel workers during a parallel
sequential scan 1 block at a time. Since other workers were likely to
request a block before a worker returns for another block number to work
on, this could lead to non-sequential I/O patterns in each worker which
could cause the operating system's readahead to perform poorly or not at
all.
Here we change things so that we allocate consecutive "chunks" of blocks
to workers and have them work on those until they're done, at which time
we allocate another chunk for the worker. The size of these chunks is
based on the size of the relation.
Initial patch here was by Thomas Munro which showed some good improvements
just having a fixed chunk size of 64 blocks with a simple ramp-down near
the end of the scan. The revisions of the patch to make the chunk size
based on the relation size and the adjusted ramp-down in powers of two was
done by me, along with quite extensive benchmarking to determine the
optimal chunk sizes.
For the most part, benchmarks have shown significant performance
improvements for large parallel sequential scans on Linux, FreeBSD and
Windows using SSDs. It's less clear how this affects the performance of
cloud providers. Tests done so far are unable to obtain stable enough
performance to provide meaningful benchmark results. It is possible that
this could cause some performance regressions on more obscure filesystems,
so we may need to later provide users with some ability to get something
closer to the old behavior. For now, let's leave that until we see that
it's really required.
Author: Thomas Munro, David Rowley
Reviewed-by: Ranier Vilela, Soumyadeep Chakraborty, Robert Haas
Reviewed-by: Amit Kapila, Kirk Jamison
Discussion: https://postgr.es/m/CA+hUKGJ_EErDv41YycXcbMbCBkztA34+z1ts9VQH+ACRuvpxig@mail.gmail.com
|
|
Making these leakproof seems helpful since (for example) if you have a
function f(int8) that is leakproof, you don't want it to effectively
become non-leakproof when you apply it to an int4 or int2 column.
But that's what happens today, since the implicit up-coercion will
not be leakproof.
Most of the coercion functions that visibly can't throw errors are
functions that convert numeric datatypes to other, wider ones.
Notable is that float4_numeric and float8_numeric can be marked
leakproof; before commit a57d312a7 they could not have been.
I also marked the functions that coerce strings to "name" as leakproof;
that's okay today because they truncate silently, but if we ever
reconsidered that behavior then they could no longer be leakproof.
I desisted from marking rtrim1() as leakproof; it appears so right now,
but the code seems a little too complex and perhaps subject to change,
since it's shared with other SQL functions.
Discussion: https://postgr.es/m/459322.1595607431@sss.pgh.pa.us
|
|
It's fairly silly that ignoring NOT subexpressions is TS_execute's
default behavior. It's wrong on its face and it encourages errors
of omission. Moreover, the only two remaining callers that aren't
specifying CALC_NOT are in ts_headline calculations, and it's very
arguable that those are bugs: if you've specified "!foo" in your
query, why would you want to get a headline that includes "foo"?
Hence, rip that out and change the default behavior to be to calculate
NOT accurately. As a concession to the slim chance that there is still
somebody somewhere who needs the incorrect behavior, provide a new
SKIP_NOT flag to explicitly request that.
Back-patch into v13, mainly because it seems better to change this
at the same time as the previous commit's rejiggering of TS_execute
related APIs. Any outside callers affected by this change are
probably also affected by that one.
Discussion: https://postgr.es/m/CALT9ZEE-aLotzBg-pOp2GFTesGWVYzXA3=mZKzRDa_OKnLF7Mg@mail.gmail.com
|
|
Text search sometimes failed to find valid matches, for instance
'!crew:A'::tsquery might fail to locate 'crew:1B'::tsvector during
an index search. The root of the issue is that TS_execute's callback
functions were not changed to use ternary (yes/no/maybe) reporting
when we made the search logic itself do so. It's somewhat annoying
to break that API, but on the other hand we now see that any code
using plain boolean logic is almost certainly broken since the
addition of phrase search. There seem to be very few outside callers
of this code anyway, so we'll just break them intentionally to get
them to adapt.
This allows removal of tsginidx.c's private re-implementation of
TS_execute, since that's now entirely duplicative. It's also no
longer necessary to avoid use of CALC_NOT in tsgistidx.c, since
the underlying callbacks can now do something reasonable.
Back-patch into v13. We can't change this in stable branches,
but it seems not quite too late to fix it in v13.
Tom Lane and Pavel Borisov
Discussion: https://postgr.es/m/CALT9ZEE-aLotzBg-pOp2GFTesGWVYzXA3=mZKzRDa_OKnLF7Mg@mail.gmail.com
|
|
The new name has been preferred by Autoconf for a long time. Future
versions of Autoconf will warn about the old name.
Discussion: https://www.postgresql.org/message-id/flat/e796c185-5ece-8569-248f-dd3799701be1%402ndquadrant.com
|
|
When wal_level=logical, write invalidations at command end into WAL so
that decoding can use this information.
This patch is required to allow the streaming of in-progress transactions
in logical decoding. The actual work to allow streaming will be committed
as a separate patch.
We still add the invalidations to the cache and write them to WAL at
commit time in RecordTransactionCommit(). This uses the existing
XLOG_INVALIDATIONS xlog record type, from the RM_STANDBY_ID resource
manager (see LogStandbyInvalidations for details).
So existing code relying on those invalidations (e.g. redo) does not need
to be changed.
The invalidations written at command end uses a new xlog record type
XLOG_XACT_INVALIDATIONS, from RM_XACT_ID resource manager. See
LogLogicalInvalidations for details.
These new xlog records are ignored by existing redo procedures, which
still rely on the invalidations written to commit records.
The invalidations are decoded and accumulated in top-transaction, and then
executed during replay. This obviates the need to decode the
invalidations as part of a commit record.
Bump XLOG_PAGE_MAGIC, since this introduces XLOG_XACT_INVALIDATIONS.
Author: Dilip Kumar, Tomas Vondra, Amit Kapila
Reviewed-by: Amit Kapila
Tested-by: Neha Sharma and Mahendra Singh Thalor
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
|
|
Add infinities that behave the same as they do in the floating-point
data types. Aside from any intrinsic usefulness these may have,
this closes an important gap in our ability to convert floating
values to numeric and/or replace float-based APIs with numeric.
The new values are represented by bit patterns that were formerly
not used (although old code probably would take them for NaNs).
So there shouldn't be any pg_upgrade hazard.
Patch by me, reviewed by Dean Rasheed and Andrew Gierth
Discussion: https://postgr.es/m/606717.1591924582@sss.pgh.pa.us
|
|
An incorrect reference to SHA-1 was present.
Author: Daniel Gustafsson
Discussion: https://postgr.es/m/FE26C953-FA87-4BB9-9105-AA1F8705B0D0@yesql.se
|
|
Holding just a buffer pin (with no buffer lock) on an nbtree buffer/page
provides very weak guarantees, especially compared to heapam, where it's
often safe to read a page while only holding a buffer pin. This commit
has Valgrind enforce the following rule: it is never okay to access an
nbtree buffer without holding both a pin and a lock on the buffer.
A draft version of this patch detected questionable code that was
cleaned up by commits fa7ff642 and 7154aa16. The code in question used
to access an nbtree buffer page's special/opaque area with no buffer
lock (only a buffer pin). This practice (which isn't obviously unsafe)
is hereby formally disallowed in nbtree. There doesn't seem to be any
reason to allow it, and banning it keeps things simple for Valgrind.
The new checks are implemented by adding custom nbtree client requests
(located in LockBuffer() wrapper functions); these requests are
"superimposed" on top of the generic bufmgr.c Valgrind client requests
added by commit 1e0dfd16. No custom resource management cleanup code is
needed to undo the effects of marking buffers as non-accessible under
this scheme.
Author: Peter Geoghegan
Reviewed-By: Anastasia Lubennikova, Georgios Kokolatos
Discussion: https://postgr.es/m/CAH2-WzkLgyN3zBvRZ1pkNJThC=xi_0gpWRUb_45eexLH1+k2_Q@mail.gmail.com
|
|
The bug fixed in commit 72eab84a5 would not have occurred if initdb
had a less surprising rule about which columns should be marked
NOT NULL by default. Let's make that rule be strictly that the
column must be fixed-width and its predecessors must be fixed-width
and NOT NULL, removing the hacky and unsafe exceptions for oidvector
and int2vector.
Since we do still want all existing oidvector and int2vector columns
to be marked NOT NULL, we have to put BKI_FORCE_NOT_NULL labels on
them. But making this less magic and more documented seems like a
good idea, even if it's a shade more verbose.
I didn't bump catversion since the initial catalog contents are
not actually changed by this patch. Note however that the
contents of postgres.bki do change, and feeding an old copy of
that to a new backend will produce wrong results.
Discussion: https://postgr.es/m/204760.1595181800@sss.pgh.pa.us
|
|
This is more consistent with the IEEE 754 spec and our treatment of
NaNs elsewhere; in particular, the case has always acted that way in
"numeric" arithmetic.
Noted by Dean Rasheed.
Discussion: https://postgr.es/m/3421746.1594927785@sss.pgh.pa.us
|
|
The code has always set this column to NULL when it's not valid,
but the catalog header's description failed to reflect that,
as did the SGML docs, as did some of the code. To prevent future
coding errors of the same ilk, let's hide the field from C code
as though it were variable-length (which, in a sense, it is).
As with commit 72eab84a5, we can only fix this cleanly in HEAD
and v13; the problem extends further back but we'll need some
klugery in the released branches.
Discussion: https://postgr.es/m/367660.1595202498@sss.pgh.pa.us
|
|
Commit b9c130a1f failed to apply the publisher-to-subscriber column
mapping while checking which columns were updated. Perhaps less
significantly, it didn't exclude dropped columns either. This could
result in an incorrect updated-columns bitmap and thus wrong decisions
about whether to fire column-specific triggers on the subscriber while
applying updates. In HEAD (since commit 9de77b545), it could also
result in accesses off the end of the colstatus array, as detected by
buildfarm member skink. Fix the logic, and adjust 003_constraints.pl
so that the problem is exposed in unpatched code.
In HEAD, also add some assertions to check that we don't access off
the ends of these newly variable-sized arrays.
Back-patch to v10, as b9c130a1f was.
Discussion: https://postgr.es/m/CAH2-Wz=79hKQ4++c5A060RYbjTHgiYTHz=fw6mptCtgghH2gJA@mail.gmail.com
|
|
max_slot_wal_keep_size that was added in v13 and wal_keep_segments are
the GUC parameters to specify how much WAL files to retain for
the standby servers. While max_slot_wal_keep_size accepts the number of
bytes of WAL files, wal_keep_segments accepts the number of WAL files.
This difference of setting units between those similar parameters could
be confusing to users.
To alleviate this situation, this commit renames wal_keep_segments to
wal_keep_size, and make users specify the WAL size in it instead of
the number of WAL files.
There was also the idea to rename max_slot_wal_keep_size to
max_slot_wal_keep_segments, in the discussion. But we have been moving
away from measuring in segments, for example, checkpoint_segments was
replaced by max_wal_size. So we concluded to rename wal_keep_segments
to wal_keep_size.
Back-patch to v13 where max_slot_wal_keep_size was added.
Author: Fujii Masao
Reviewed-by: Álvaro Herrera, Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/574b4ea3-e0f9-b175-ead2-ebea7faea855@oss.nttdata.com
|
|
The logical decoding infrastructure needs to know which top-level
transaction the subxact belongs to, in order to decode all the
changes. Until now that might be delayed until commit, due to the
caching (GPROC_MAX_CACHED_SUBXIDS), preventing features requiring
incremental decoding.
So we also write the assignment info into WAL immediately, as part
of the next WAL record (to minimize overhead) only when wal_level=logical.
We can not remove the existing XLOG_XACT_ASSIGNMENT WAL as that is
required for avoiding overflow in the hot standby snapshot.
Bump XLOG_PAGE_MAGIC, since this introduces XLR_BLOCK_ID_TOPLEVEL_XID.
Author: Tomas Vondra, Dilip Kumar, Amit Kapila
Reviewed-by: Amit Kapila
Tested-by: Neha Sharma and Mahendra Singh Thalor
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
|
|
There was no easy way to find how many times generic and custom plans
have been executed for a prepared statement. This commit exposes those
numbers of times in pg_prepared_statements view.
Author: Atsushi Torikoshi, Kyotaro Horiguchi
Reviewed-by: Tatsuro Yamada, Masahiro Ikeda, Fujii Masao
Discussion: https://postgr.es/m/CACZ0uYHZ4M=NZpofH6JuPHeX=__5xcDELF8hT8_2T+R55w4RQw@mail.gmail.com
|
|
Due to the layout of this catalog, subslotname has to be explicitly
marked BKI_FORCE_NULL, else initdb will default to the assumption
that it's non-nullable. Since, in fact, CREATE/ALTER SUBSCRIPTION
will store null values there, the existing marking is just wrong,
and has been since this catalog was invented.
We haven't noticed because not much in the system actually depends
on attnotnull being truthful. However, JIT'ed tuple deconstruction
does depend on that in some cases, allowing crashes or wrong answers
in queries that inspect pg_subscription. Commit 9de77b545 quite
accidentally exposed this on the buildfarm members that force JIT
activation.
Back-patch to v13. The problem goes further back, but we cannot
force initdb in released branches, so some klugier solution will
be needed there. Before working on that, push this simple fix
to try to get the buildfarm back to green.
Discussion: https://postgr.es/m/4118109.1595096139@sss.pgh.pa.us
|
|
This avoids deprecation warnings from newer OpenSSL versions (3.0.0 in
particular).
Discussion: https://www.postgresql.org/message-id/flat/FEF81714-D479-4512-839B-C769D2605F8A%40yesql.se
|
|
This header hasn't changed recently, so the fact that it now fails
headerscheck/cpluspluscheck testing must be due to changes in what
it includes. Probably f21916791 is to blame, but I didn't try to
verify that.
Discussion: https://postgr.es/m/3699703.1595016554@sss.pgh.pa.us
|
|
This patch adds a "binary" option to CREATE/ALTER SUBSCRIPTION.
When that's set, the publisher will send data using the data type's
typsend function if any, rather than typoutput. This is generally
faster, if slightly less robust.
As committed, we won't try to transfer user-defined array or composite
types in binary, for fear that type OIDs won't match at the subscriber.
This might be changed later, but it seems like fit material for a
follow-on patch.
Dave Cramer, reviewed by Daniel Gustafsson, Petr Jelinek, and others;
adjusted some by me
Discussion: https://postgr.es/m/CADK3HH+R3xMn=8t3Ct+uD+qJ1KD=Hbif5NFMJ+d5DkoCzp6Vgw@mail.gmail.com
|
|
Teach Valgrind memcheck to maintain the "defined-ness" of each shared
buffer based on whether the backend holds at least one pin at the point
it is accessed by access method code. Bugs like the one fixed by commit
b0229f26 can be detected using this new instrumentation.
Note that backends running with Valgrind naturally have their own
independent ideas about whether any given byte in shared memory is safe
or unsafe to access. There is no risk that concurrent access by
multiple backends to the same shared memory will confuse Valgrind's
instrumentation, because everything already works at the process level
(or at the memory mapping level, if you prefer).
Author: Álvaro Herrera, Peter Geoghegan
Reviewed-By: Anastasia Lubennikova
Discussion: https://postgr.es/m/20150723195349.GW5596@postgresql.org
Discussion: https://postgr.es/m/CAH2-WzkLgyN3zBvRZ1pkNJThC=xi_0gpWRUb_45eexLH1+k2_Q@mail.gmail.com
|
|
This representation saves 8 bytes per tuple compared to HeapTuple, and
avoids the need to allocate, copy and free on the receiving side.
Gather can emit the returned MinimalTuple directly, but GatherMerge now
needs to make an explicit copy because it buffers multiple tuples at a
time. That should be no worse than before.
Reviewed-by: Soumyadeep Chakraborty <soumyadeep2007@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2B8T_ggoUTAE-U%3DA%2BOcPc4%3DB0nPPHcSfffuQhvXXjML6w%40mail.gmail.com
|
|
This allows the huge page size to be set explicitly. The default is 0,
meaning it will use the system default, as before.
Author: Odin Ugedal <odin@ugedal.com>
Discussion: https://postgr.es/m/20200608154639.20254-1-odin%40ugedal.com
|
|
When using the following functions, users could see various types of
errors of the type "cache lookup failed for OID XXX" with elog(), that
can only be used for internal errors:
* pg_describe_object()
* pg_identify_object()
* pg_identify_object_as_address()
The set of APIs managing object addresses for all object types are made
smarter by gaining a new argument "missing_ok" that allows any caller to
control if an error is raised or not on an undefined object. The SQL
functions listed above are changed to handle the case where an object is
missing.
Regression tests are added for all object types for the cases where
these are undefined. Before this commit, these cases failed with cache
lookup errors, and now they basically return NULL (minus the name of the
object type requested).
Author: Michael Paquier
Reviewed-by: Aleksander Alekseev, Dmitry Dolgov, Daniel Gustafsson,
Álvaro Herrera, Kyotaro Horiguchi
Discussion: https://postgr.es/m/CAB7nPqSZxrSmdHK-rny7z8mi=EAFXJ5J-0RbzDw6aus=wB5azQ@mail.gmail.com
|
|
Three groups of issues needed to be addressed:
load_external_function() and related functions returned PGFunction,
even though not necessarily all callers are looking for a function of
type PGFunction. Since these functions are really just wrappers
around dlsym(), change to return void * just like dlsym().
In dynahash.c, we are using strlcpy() where a function with a
signature like memcpy() is expected. This should be safe, as the new
comment there explains, but the cast needs to be augmented to avoid
the warning.
In PL/Python, methods all need to be cast to PyCFunction, per Python
API, but this now runs afoul of these warnings. (This issue also
exists in core CPython.)
To fix the second and third case, we add a new type pg_funcptr_t that
is defined specifically so that gcc accepts it as a special function
pointer that can be cast to any other function pointer without the
warning.
Also add -Wcast-function-type to the standard warning flags, subject
to configure check.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/1e97628e-6447-b4fd-e230-d109cec2d584%402ndquadrant.com
|
|
The following header files missed the shot:
- jsonfuncs.h, as of ce0425b.
- jsonapi.h, as of beb4699.
- llvmjit_emit.h as of 7ec0d80.
- partdesc.h, as of 1bb5e78.
Author: Jesse Zhang
Discussion: https://postgr.es/m/CAGf+fX4-8xULEOz09DE2dZGjT+q8VJ--rqfTpvcFwc+A4fc-3Q@mail.gmail.com
|
|
Incorrect function names were referenced. As this fixes some portions
of tableam.h, that is mentioned in the docs as something to look at when
implementing a table AM, backpatch down to 12 where this has been
introduced.
Author: Hironobu Suzuki
Discussion: https://postgr.es/m/8fe6d672-28dd-3f1d-7aed-ac2f6d599d3f@interdb.jp
Backpatch-through: 12
|
|
This is a replacement for 4cad2534. Instead of projecting all tuples
going into a HashAgg, only remove unnecessary attributes when actually
spilling. This avoids the regression for the in-memory case.
Discussion: https://postgr.es/m/a2fb7dfeb4f50aa0a123e42151ee3013933cb802.camel%40j-davis.com
Backpatch-through: 13
|
|
The stats with this commit was available only for WALSenders, however,
users might want to see for backends doing logical decoding via SQL API.
Then, users might want to reset and access these stats across server
restart which was not possible with the current patch.
List of commits reverted:
caa3c4242c Don't call elog() while holding spinlock.
e641b2a995 Doc: Update the documentation for spilled transaction
statistics.
5883f5fe27 Fix unportable printf format introduced in commit 9290ad198.
9290ad198b Track statistics for spilling of changes from ReorderBuffer.
Additionaly, remove the release notes entry for this feature.
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/CA+fd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg@mail.gmail.com
|
|
This includes two changes:
- Addition of a new function pg_xact_commit_timestamp_origin() able, for
a given transaction ID, to return the commit timestamp and replication
origin of this transaction. An equivalent function existed in
pglogical.
- Addition of the replication origin to pg_last_committed_xact().
The commit timestamp manager includes already APIs able to return the
replication origin of a transaction on top of its commit timestamp, but
the code paths for replication origins were never stressed as those
functions have never looked for a replication origin, and the SQL
functions available have never included this information since their
introduction in 73c986a.
While on it, refactor a test of modules/commit_ts/ to use tstzrange() to
check that a transaction timestamp is within the wanted range, making
the test a bit easier to read.
Bump catalog version.
Author: Movead Li
Reviewed-by: Madan Kumar, Michael Paquier
Discussion: https://postgr.es/m/2020051116430836450630@highgo.ca
|
|
"relkind" normally refers to the char field from pg_class. However, in
the parse nodes AlterTableStmt and CreateTableAsStmt, "relkind" was used
for a field of type enum ObjectType, that could refer to other object
types than those possible for a relkind. Such fields being usually
named "objtype", switch the name in both structures to make things more
consistent. Note that this led to some confusion in functions that
also operate on a RangeTableEntry object, which also has a field named
"relkind".
This naming goes back to commit 09d4e96, where only OBJECT_TABLE and
OBJECT_INDEX were used. This got extended later to use as well
OBJECT_TYPE with e440e12, not really a relation kind.
Author: Mark Dilger
Reviewed-by: Daniel Gustafsson, Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/609181AE-E399-47C7-9221-856E0F96BF93@enterprisedb.com
|