Age | Commit message (Collapse) | Author |
|
The DROP STATISTICS code failed to properly lock the table, leading to
ERROR: tuple concurrently deleted
when executed concurrently with ANALYZE.
Fixed by modifying RemoveStatisticsById() to acquire the same lock as
ANALYZE. This function is called only by DROP STATISTICS, as ANALYZE
calls RemoveStatisticsDataById() directly.
Reported by Justin Pryzby, fix by me. Backpatch through 12. The code was
like this since it was introduced in 10, but older releases are EOL.
Reported-by: Justin Pryzby
Reviewed-by: Tom Lane
Backpatch-through: 12
Discussion: https://postgr.es/m/ZUuk-8CfbYeq6g_u@pryzbyj2023
|
|
contain_mutable_functions and contain_volatile_functions give
reliable answers only after expression preprocessing (specifically
eval_const_expressions). Some places understand this, but some did
not get the memo --- which is not entirely their fault, because the
problem is documented only in places far away from those functions.
Introduce wrapper functions that allow doing the right thing easily,
and add commentary in hopes of preventing future mistakes from
copy-and-paste of code that's only conditionally safe.
Two actual bugs of this ilk are fixed here. We failed to preprocess
column GENERATED expressions before checking mutability, so that the
code could fail to detect the use of a volatile function
default-argument expression, or it could reject a polymorphic function
that is actually immutable on the datatype of interest. Likewise,
column DEFAULT expressions weren't preprocessed before determining if
it's safe to apply the attmissingval mechanism. A false negative
would just result in an unnecessary table rewrite, but a false
positive could allow the attmissingval mechanism to be used in a case
where it should not be, resulting in unexpected initial values in a
new column.
In passing, re-order the steps in ComputePartitionAttrs so that its
checks for invalid column references are done before applying
expression_planner, rather than after. The previous coding would
not complain if a partition expression contains a disallowed column
reference that gets optimized away by constant folding, which seems
to me to be a behavior we do not want.
Per bug #18097 from Jim Keener. Back-patch to all supported versions.
Discussion: https://postgr.es/m/18097-ebb179674f22932f@postgresql.org
|
|
It's undesirable to have SQL commands or configuration options in a
translatable error string, so take some of these out.
|
|
Dropping a temp table could entail TOAST table access to clean out
toasted catalog entries, such as large pg_constraint.conbin strings
for complex CHECK constraints. If we did that via ON COMMIT DROP,
we triggered the assertion in init_toast_snapshot(), because
there was no provision for setting up a snapshot for the drop
actions. Fix that.
(I assume here that the adjacent truncation actions for ON COMMIT
DELETE ROWS don't have a similar problem: it doesn't seem like
nontransactional truncations would need to touch any toasted fields.
If that proves wrong, we could refactor a bit to have the same
snapshot acquisition cover that too.)
The test case added here does not fail before v15, because that
assertion was added in 277692220 which was not back-patched.
However, the race condition the assertion warns of surely
exists further back, so back-patch to all supported branches.
Per report from Richard Guo.
Discussion: https://postgr.es/m/CAMbWs4-x26=_QxxgdJyNbiCDzvtr2WV5ZDso_v-CukKEe6cBZw@mail.gmail.com
|
|
In recent releases, such cases fail with "cache lookup failed for
function 0" rather than complaining that the conversion function
doesn't exist as prior versions did. Seems to be a consequence of
sloppy refactoring in commit f82de5c46. Add the missing error check.
Per report from Pierre Fortin. Back-patch to v14 where the
oversight crept in.
Discussion: https://postgr.es/m/20230929163739.3bea46e5.pfortin@pfortin.com
|
|
Commit 9f8377f7a2 was a little too eager in fetching default values.
Normally this would not matter, but if the default value is not valid
for the type (e.g. a varchar that's too long) it caused an unnecessary
error.
Complaint and fix from Laurenz Albe
Backpatch to release 16.
Discussion: https://postgr.es/m/75a7b7483aeb331aa017328d606d568fc715b90d.camel@cybertec.at
|
|
ANALYZE on a table with inheritance children analyzes all the child
tables in a loop. When stepping to next child table, it updated the
child rel ID value in the command progress stats, but did not reset
the 'sample_blks_total' and 'sample_blks_scanned' counters.
acquire_sample_rows() updates 'sample_blks_total' as soon as the scan
starts and 'sample_blks_scanned' after processing the first block, but
until then, pg_stat_progress_analyze would display a bogus combination
of the new child table relid with old counter values from the
previously processed child table. Fix by resetting 'sample_blks_total'
and 'sample_blks_scanned' to zero at the same time that
'current_child_table_relid' is updated.
Backpatch to v13, where pg_stat_progress_analyze view was introduced.
Reported-by: Justin Pryzby
Discussion: https://www.postgresql.org/message-id/20230122162345.GP13860%40telsasoft.com
|
|
Reported-by: Jeff Davis
Author: Hou Zhijie
Reviewed-by: Amit Kapila
Backpatch-through: 16
Discussion: http://postgr.es/m/17b62714fd115bd1899afd922954540a5c6a0467.camel@j-davis.com
|
|
Commit 31966b15 invented a way for functions dealing with relation
extension to accept a Relation in online code and an SMgrRelation in
recovery code. It seems highly likely that future bufmgr.c interfaces
will face the same problem, and need to do something similar.
Generalize the names so that each interface doesn't have to re-invent
the wheel.
Back-patch to 16. Since extension AM authors might start using the
constructor macros once 16 ships, we agreed to do the rename in 16
rather than waiting for 17.
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKG%2B6tLD2BhpRWycEoti6LVLyQq457UL4ticP5xd8LqHySA%40mail.gmail.com
|
|
Discussion: https://www.postgresql.org/message-id/CAHut+PtfzQ7JRkb0-Y_UejAxaLQ17-bGMvV4MJJHcPoP3ML2bg@mail.gmail.com
|
|
Substituting such values in extension scripts facilitated SQL injection
when @extowner@, @extschema@, or @extschema:...@ appeared inside a
quoting construct (dollar quoting, '', or ""). No bundled extension was
vulnerable. Vulnerable uses do appear in a documentation example and in
non-bundled extensions. Hence, the attack prerequisite was an
administrator having installed files of a vulnerable, trusted,
non-bundled extension. Subject to that prerequisite, this enabled an
attacker having database-level CREATE privilege to execute arbitrary
code as the bootstrap superuser. By blocking this attack in the core
server, there's no need to modify individual extensions. Back-patch to
v11 (all supported versions).
Reported by Micah Gate, Valerie Woolard, Tim Carey-Smith, and Christoph
Berg.
Security: CVE-2023-39417
|
|
9f8377f7a added code to allow COPY FROM insert a column's default value
when the input matches the DEFAULT string specified in the COPY command.
Here we fix some inefficient code which needlessly palloc0'd an array to
store if we should use the default value or input value for the given
column. This array was being palloc0'd and pfree'd once per row. It's
much more efficient to allocate this once and just reset the values once
per row.
Reported-by: Masahiko Sawada
Author: Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoDvDmUQeJtZrau1ovnT_smN940%3DKp6mszNGK3bq9yRN6g%40mail.gmail.com
Backpatch-through: 16, where 9f8377f7a was introduced.
|
|
indisvalid is switched to true for partitioned indexes when all its
partitions have valid indexes when attaching a new partition, up to the
top-most parent if all its leaves are themselves valid when dealing with
multiple layers of partitions.
The copy of the tuple from pg_index used to switch indisvalid to true
came from the relation cache, which is incorrect. Particularly, in the
case reported by Shruthi Gowda, executing a series of commands in a
single transaction would cause the validation of partitioned indexes to
use an incorrect version of a pg_index tuple, as indexes are reloaded
after an invalidation request with RelationReloadIndexInfo(), a much
faster version than a full index cache rebuild. In this case, the
limited information updated in the cache leads to an incorrect version
of the tuple used. One of the symptoms reported was the following
error, with a replica identity update, for instance:
"ERROR: attempted to update invisible tuple"
This is incorrect since 8b08f7d, so backpatch all the way down.
Reported-by: Shruthi Gowda
Author: Michael Paquier
Reviewed-by: Shruthi Gowda, Dilip Kumar
Discussion: https://postgr.es/m/CAASxf_PBcxax0wW-3gErUyftZ0XrCs3Lrpuhq4-Z3Fak1DoW7Q@mail.gmail.com
Backpatch-through: 11
|
|
Until now, when DROP DATABASE got interrupted in the wrong moment, the removal
of the pg_database row would also roll back, even though some irreversible
steps have already been taken. E.g. DropDatabaseBuffers() might have thrown
out dirty buffers, or files could have been unlinked. But we continued to
allow connections to such a corrupted database.
To fix this, mark databases invalid with an in-place update, just before
starting to perform irreversible steps. As we can't add a new column in the
back branches, we use pg_database.datconnlimit = -2 for this purpose.
An invalid database cannot be connected to anymore, but can still be
dropped.
Unfortunately we can't easily add output to psql's \l to indicate that some
database is invalid, it doesn't fit in any of the existing columns.
Add tests verifying that a interrupted DROP DATABASE is handled correctly in
the backend and in various tools.
Reported-by: Evgeny Morozov <postgresql3@realityexists.net>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20230509004637.cgvmfwrbht7xm7p6@awork3.anarazel.de
Discussion: https://postgr.es/m/20230314174521.74jl6ffqsee5mtug@awork3.anarazel.de
Backpatch: 11-, bug present in all supported versions
|
|
When vac_truncate_clog() encounters bogus datfrozenxid / datminmxid values, it
returns early. Unfortunately, until now, it did not release
WrapLimitsVacuumLock. If the backend later tries to acquire
WrapLimitsVacuumLock, the session / autovacuum worker hangs in an
uncancellable way. Similarly, other sessions will hang waiting for the
lock. However, if the backend holding the lock exited or errored out for some
reason, the lock was released.
The bug was introduced as a side effect of 566372b3d643.
It is interesting that there are no production reports of this problem. That
is likely due to a mix of bugs leading to bogus values having gotten less
common, process exit releasing locks and instances of hangs being hard to
debug for "normal" users.
Discussion: https://postgr.es/m/20230621221208.vhsqgduwfpzwxnpg@awork3.anarazel.de
|
|
|
|
As coded, the code would use as a base comparison the namespace OID from
the first object scanned in pg_depend when switching its namespace
dependency entry to the new one, and use it as a base of comparison for
any follow-up checks. It would also be used as the old namespace OID to
switch *from* for the extension's pg_depend entry. Hence, if the first
object scanned has a namespace different than the one stored in the
extension, we would finish by:
- Not checking that the extension objects map with the extension's
schema.
- Not switching the extension -> namespace dependency entry to the new
namespace provided by the user, making ALTER EXTENSION ineffective.
This issue exists since this command has been introduced in d9572c4 for
relocatable extension, so backpatch all the way down to 11. The test
case has been provided by Heikki, that I have tweaked a bit to show the
effects on pg_depend for the extension.
Reported-by: Heikki Linnakangas
Author: Michael Paquier, Heikki Linnakangas
Discussion: https://postgr.es/m/20eea594-a05b-4c31-491b-007b6fceef28@iki.fi
Backpatch-through: 11
|
|
This reverts the following commits: 4dbdb82513, c2122aae63,
5b1a879943, 9e1e9d6560, ff9618e82a, 60684dd834, 4441fc704d,
and b5d6382496. A role with the MAINTAIN privilege may be able to
use search_path tricks to escalate privileges to the table owner.
Unfortunately, it is too late in the v16 development cycle to apply
the proposed fix, i.e., restricting search_path when running
maintenance commands.
Bumps catversion.
Reviewed-by: Jeff Davis
Discussion: https://postgr.es/m/E1q7j7Y-000z1H-Hr%40gemulon.postgresql.org
Backpatch-through: 16
|
|
Commit 7b64e4b3 taught DropSubscription() to drop stats entry of
subscription that is not associated with a replication slot for apply
worker at DROP SUBSCRIPTION but missed covering the case where the
subscription is not associated with replication slots for both apply
worker and tablesync worker.
Also add a test to verify that the stats for slot-less subscription is
removed at DROP SUBSCRIPTION time.
Backpatch down to 15.
Author: Masahiko Sawada
Reviewed-by: Nathan Bossart, Hayato Kuroda, Melih Mutlu, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoB71zkP7uPT7JDPsZcvp0749ExEQnOJxeNKPDFisHar+w@mail.gmail.com
Backpatch-through: 15
|
|
Commit f24523672d fixed a memory leak by moving the modifiedCols bitmap
into the per-row memory context. In the case of AFTER UPDATE triggers,
the bitmap is however referenced from an event kept until the end of the
query, resulting in a use-after-free bug.
Fixed by copying the bitmap into the AfterTriggerEvents memory context,
which is the one where we keep the trigger events. There's only one
place that needs to do the copy, but the memory context may not exist
yet. Doing that in a separate function seems more readable.
Report by Alexander Pyhalov, fix by me. Backpatch to 13, where the
bitmap was added to the event by commit 71d60e2aa0.
Reported-by: Alexander Pyhalov
Backpatch-through: 13
Discussion: https://postgr.es/m/acddb17c89b0d6cb940eaeda18c08bbe@postgrespro.ru
|
|
The logic that introduced partitioned indexes missed a few things when
invalidating a partitioned index when these are created, still the code
is written to handle recursions:
1) If created from scratch because a mapping index could not be found,
the new index created could be itself invalid, if for example it was a
partitioned index with one of its leaves invalid.
2) A CCI was missing when indisvalid is set for a parent index, leading
to inconsistent trees when recursing across more than one level for a
partitioned index creation if an invalidation of the parent was
required.
This could lead to the creation of a partition index tree where some of
the partitioned indexes are marked as invalid, but some of the parents
are marked valid, which is not something that should happen (as
validatePartitionedIndex() defines, indisvalid is switched to true for a
partitioned index iff all its partitions are themselves valid).
This patch makes sure that indisvalid is set to false on a partitioned
index if at least one of its partition is invalid. The flag is set to
true if *all* its partitions are valid.
The regression test added in this commit abuses of a failed concurrent
index creation, marked as invalid, that maps with an index created on
its partitioned table afterwards.
Reported-by: Alexander Lakhin
Reviewed-by: Alexander Lakhin
Discussion: https://postgr.es/m/14987634-43c0-0cb3-e075-94d423607e08@gmail.com
Backpatch-through: 11
|
|
ALTER TABLE .. SET ACCESS METHOD was not registering a dependency to the
new access method with the relation altered in its rewrite phase, making
possible the drop of an access method even if there are relations that
depend on it. During the rewrite, a temporary relation is created to
build the new relation files before swapping the new and old files, and,
while the temporary relation was registering a correct dependency to the
new AM, the old relation did not do that. A dependency on the access
method is added when the relation files are swapped, which is the point
where pg_class is updated.
Materialized views and tables use the same code path, hence both were
impacted.
Backpatch down to 15, where this command has been introduced.
Reported-by: Alexander Lakhin
Reviewed-by: Nathan Bossart, Andres Freund
Discussion: https://postgr.es/m/18000-9145c25b1af475ca@postgresql.org
Backpatch-through: 15
|
|
|
|
PARTITION
A portion of ALTER TABLE .. ATTACH PARTITION is to ensure that the
partition being attached to the partitioned table has a correct set of
indexes, so as there is a consistent index mapping between the
partitioned table and its new-to-be partition. However, as introduced
in 8b08f7d, the current logic could choose an invalid index as a match,
which is something that can exist when dealing with more than two levels
of partitioning, like attaching a partitioned table (that has
partitions, with an index created by CREATE INDEX ON ONLY) to another
partitioned table.
A partitioned index with indisvalid set to false is equivalent to an
incomplete partition tree, meaning that an invalid partitioned index
does not have indexes defined in all its partitions. Hence, choosing an
invalid partitioned index can create inconsistent partition index trees,
where the parent attaching to is valid, but its partition may be
invalid.
In the report from Alexander Lakhin, this showed up as an assertion
failure when validating an index. Without assertions enabled, the
partition index tree would be actually broken, as indisvalid should
be switched to true for a partitioned index once all its partitions are
themselves valid. With two levels of partitioning, the top partitioned
table used a valid index and was able to link to an invalid index stored
on its partition, itself a partitioned table.
I have studied a few options here (like the possibility to switch
indisvalid to false for the parent), but came down to the conclusion
that we'd better rely on a simple rule: invalid indexes had better never
be chosen, so as the partition attached uses and creates indexes that
the parent expects. Some regression tests are added to provide some
coverage. Note that the existing coverage is not impacted.
This is a problem since partitioned indexes exist, so backpatch all the
way down to v11.
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/14987634-43c0-0cb3-e075-94d423607e08@gmail.com
Backpatch-through: 11
|
|
Since 3db72eb, the query ID of utilities is generated using the Query
structure, making the use of the query string in JumbleQuery()
unnecessary. This commit removes the argument "querytext" from
JumbleQuery().
Reported-by: Joe Conway
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/ZJlQAWE4COFqHuAV@paquier.xyz
|
|
ff9618e82a introduced has_partition_ancestor_privs(), which is used
to check whether a user has MAINTAIN on any partition ancestors.
This involves syscache lookups, and presently this function does
not take any relation locks, so it is likely subject to the same
kind of cache lookup failures that were fixed by 19de0ab23c.
To fix this problem, this commit partially reverts ff9618e82a.
Specifically, it removes the partition-related changes, including
the has_partition_ancestor_privs() function mentioned above. This
means that MAINTAIN on a partitioned table is no longer sufficient
to perform maintenance commands on its partitions. This is more
like how privileges for maintenance commands work on supported
versions. Privileges are checked for each partition, so a command
that flows down to all partitions might refuse to process them
(e.g., if the current user doesn't have MAINTAIN on the partition).
In passing, adjust a few related comments and error messages, and
add a test for the privilege checks for CLUSTER on a partitioned
table.
Reviewed-by: Michael Paquier, Jeff Davis
Discussion: https://postgr.es/m/20230613211246.GA219055%40nathanxps13
|
|
The existing errhint message and docs were missing the fact that we can't
disassociate from the slot unless the subscription is disabled.
Author: Robert Sjöblom, Peter Smith
Reviewed-by: Peter Eisentraut, Amit Kapila
Backpatch-through: 11
Discussion: https://postgr.es/m/807bdf85-61ea-88e2-5712-6d9fcd4eabff@fortnox.se
|
|
ff9618e82a introduced the skip_privs parameter, which is used to
skip privilege checks when recursing to a relation's TOAST table.
This parameter should have been added as a flag bit in
VacuumParams->options instead.
Suggested-by: Michael Paquier
Reviewed-by: Michael Paquier, Jeff Davis
Discussion: https://postgr.es/m/ZIj4v1CwqlDVJZfB%40paquier.xyz
|
|
For CREATE DATABASE, make LOCALE parameter apply regardless of the
provider used. Also affects initdb and createdb --locale arguments.
Previously, LOCALE (and --locale) only affected the database default
collation when using the libc provider.
Discussion: https://postgr.es/m/1a63084d-221e-4075-619e-6b3e590f673e@enterprisedb.com
Reviewed-by: Peter Eisentraut
|
|
This reverts commit 05e17373517114167d002494e004fa0aa32d1fd1.
|
|
While executing maintenance operations (ANALYZE, CLUSTER, REFRESH
MATERIALIZED VIEW, REINDEX, or VACUUM), set search_path to
'pg_catalog, pg_temp' to prevent inconsistent behavior.
Functions that are used for functional indexes, in index expressions,
or in materialized views and depend on a different search path must be
declared with CREATE FUNCTION ... SET search_path='...'.
This change addresses a security risk introduced in commit 60684dd834,
where a role with MAINTAIN privileges on a table may be able to
escalate privileges to the table owner. That commit is not yet part of
any release, so no need to backpatch.
Discussion: https://postgr.es/m/e44327179e5c9015c8dda67351c04da552066017.camel%40j-davis.com
Reviewed-by: Greg Stark
Reviewed-by: Nathan Bossart
|
|
Run pgindent, pgperltidy, and reformat-dat-files.
This set of diffs is a bit larger than typical. We've updated to
pg_bsd_indent 2.1.2, which properly indents variable declarations that
have multi-line initialization expressions (the continuation lines are
now indented one tab stop). We've also updated to perltidy version
20230309 and changed some of its settings, which reduces its desire to
add whitespace to lines to make assignments etc. line up. Going
forward, that should make for fewer random-seeming changes to existing
code.
Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql
|
|
This reverts commit 096dd80f3ccc and its fixups beecbe8e5001, afdd9f7f0e00,
529da086ba, db93e739ac61.
Catversion is bumped.
Discussion: https://postgr.es/m/d46f9265-ff3c-6743-2278-6772598233c2%40pgmasters.net
|
|
c3afe8cf5a updated this error message, but it didn't use the new
style established in de4d456b40.
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/20230512203721.GA2644063%40nathanxps13.home
|
|
The two methods don't cooperate, so set_config_option("search_path",
...) has been ineffective under non-empty overrideStack. This defect
enabled an attacker having database-level CREATE privilege to execute
arbitrary code as the bootstrap superuser. While that particular attack
requires v13+ for the trusted extension attribute, other attacks are
feasible in all supported versions.
Standardize on the combination of NewGUCNestLevel() and
set_config_option("search_path", ...). It is newer than
PushOverrideSearchPath(), more-prevalent, and has no known
disadvantages. The "override" mechanism remains for now, for
compatibility with out-of-tree code. Users should update such code,
which likely suffers from the same sort of vulnerability closed here.
Back-patch to v11 (all supported versions).
Alexander Lakhin. Reported by Alexander Lakhin.
Security: CVE-2023-2454
|
|
The changes done in this commit impact comments with no direct
user-visible changes, with fixes for incorrect function, variable or
structure names.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/e8c38840-596a-83d6-bd8d-cebc51111572@gmail.com
|
|
CREATE SCHEMA AUTHORIZATION with appended schema elements can lead to
crashes when comparing the schema name of the query with the schemas
used in the qualification of some clauses in the elements' queries.
The origin of the problem is that the transformation routine for the
elements listed in a CREATE SCHEMA query uses as new, expected, schema
name the one listed in CreateSchemaStmt itself. However, depending on
the query, CreateSchemaStmt.schemaname may be NULL, being computed
instead from the role specification of the query given by the
AUTHORIZATION clause, that could be either:
- A user name string, with the new schema name being set to the same
value as the role given.
- Guessed from CURRENT_ROLE, SESSION_ROLE or CURRENT_ROLE, with a new
schema name computed from the security context where CREATE SCHEMA is
running.
Regression tests are added for CREATE SCHEMA with some appended elements
(some of them with schema qualifications), covering also some role
specification patterns.
While on it, this simplifies the context structure used during the
transformation of the elements listed in a CREATE SCHEMA query by
removing the fields for the role specification and the role type. They
were not used, and for the role specification this could be confusing as
the schema name may by extracted from that at the beginning of
CreateSchemaCommand().
This issue exists for a long time, so backpatch down to all the versions
supported.
Reported-by: Song Hongyu
Author: Michael Paquier
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/17909-f65c12dfc5f0451d@postgresql.org
Backpatch-through: 11
|
|
This is not handled uniformly throughout the code, but at least nearby
code can be consistent.
|
|
Author: Alexander Lakhin
Discussion: https://postgr.es/m/699beab4-a6ca-92c9-f152-f559caf6dc25@gmail.com
|
|
Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/766f3799-0269-162f-ba63-4cae34a5534f@enterprisedb.com
|
|
This reverts commit e056c557aef4 and minor later fixes thereof.
There's a few problems in this new feature -- most notably regarding
pg_upgrade behavior, but others as well. This new feature is not in any
way critical on its own, so instead of scrambling to fix it we revert it
and try again in early 17 with these issues in mind.
Discussion: https://postgr.es/m/3801207.1681057430@sss.pgh.pa.us
|
|
There's no need to check if opt->arg is NULL since defGetString() already
does that and raises an ERROR if it is. Let's just remove that check.
Also, combine the two remaining ERRORs into a single check. It seems
better to give an indication about what sort of values we're looking for
rather than just to state that the value given isn't valid. Make
BUFFER_USAGE_LIMIT uppercase in this ERROR message too. It's already
upper case in one other error message, so make that consistent.
Reported-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20230411.102335.1643720544536884844.horikyota.ntt@gmail.com
|
|
We now create pg_constaint rows for NOT NULL constraints with
contype='n'.
We propagate these constraints during operations such as adding
inheritance relationships, creating and attaching partitions, creating
tables LIKE other tables. We mostly follow the well-known rules of
conislocal and coninhcount that we have for CHECK constraints, with some
adaptations; for example, as opposed to CHECK constraints, we don't
match NOT NULL ones by name when descending a hierarchy to alter it;
instead we match by column number. This means we don't require the
constraint names to be identical across a hierarchy.
For now, we omit them from system catalogs. Maybe this is worth
reconsidering. We don't support NOT VALID nor DEFERRABLE clauses
either; these can be added as separate features later (this patch is
already large and complicated enough.)
This has been very long in the making. The first patch was written by
Bernd Helmle in 2010 to add a new pg_constraint.contype value ('n'),
which I (Álvaro) then hijacked in 2011 and 2012, until that one was
killed by the realization that we ought to use contype='c' instead:
manufactured CHECK constraints. However, later SQL standard
development, as well as nonobvious emergent properties of that design
(mostly, failure to distinguish them from "normal" CHECK constraints as
well as the performance implication of having to test the CHECK
expression) led us to reconsider this choice, so now the current
implementation uses contype='n' again.
In 2016 Vitaly Burovoy also worked on this feature[1] but found no
consensus for his proposed approach, which was claimed to be closer to
the letter of the standard, requiring additional pg_attribute columns to
track the OID of the NOT NULL constraint for that column.
[1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/CACA0E642A0267EDA387AF2B%40%5B172.26.14.62%5D
Discussion: https://postgr.es/m/AANLkTinLXMOEMz+0J29tf1POokKi4XDkWJ6-DDR9BKgU@mail.gmail.com
Discussion: https://postgr.es/m/20110707213401.GA27098@alvh.no-ip.org
Discussion: https://postgr.es/m/1343682669-sup-2532@alvh.no-ip.org
Discussion: https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com
Discussion: https://postgr.es/m/20220817181249.q7qvj3okywctra3c@alvherre.pgsql
|
|
1cbbee033 added BUFFER_USAGE_LIMIT to the VACUUM and ANALYZE commands, so
here we permit that option to be specified in vacuumdb.
In passing, adjust the documents for vacuum_buffer_usage_limit and the
BUFFER_USAGE_LIMIT VACUUM option to mention "kB" rather than "KB". Do the
same for the ERROR message in ExecVacuum() and
check_vacuum_buffer_usage_limit(). Without that we might tell a user that
the valid minimum value is 128 KB only to reject that because we accept
only "kB" and not "KB".
Also, add a small reminder comment in vacuum.h to try to trigger the
memory of anyone adding new fields to VacuumParams that they might want to
consider if vacuumdb needs to grow a new option too.
Author: Melanie Plageman
Reviewed-by: Justin Pryzby
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/ZAzTg3iEnubscvbf@telsasoft.com
|
|
Add new options to the VACUUM and ANALYZE commands called
BUFFER_USAGE_LIMIT to allow users more control over how large to make the
buffer access strategy that is used to limit the usage of buffers in
shared buffers. Larger rings can allow VACUUM to run more quickly but
have the drawback of VACUUM possibly evicting more buffers from shared
buffers that might be useful for other queries running on the database.
Here we also add a new GUC named vacuum_buffer_usage_limit which controls
how large to make the access strategy when it's not specified in the
VACUUM/ANALYZE command. This defaults to 256KB, which is the same size as
the access strategy was prior to this change. This setting also
controls how large to make the buffer access strategy for autovacuum.
Per idea by Andres Freund.
Author: Melanie Plageman
Reviewed-by: David Rowley
Reviewed-by: Andres Freund
Reviewed-by: Justin Pryzby
Reviewed-by: Bharath Rupireddy
Discussion: https://postgr.es/m/20230111182720.ejifsclfwymw2reb@awork3.anarazel.de
|
|
Allow autovacuum to reload the config file more often so that cost-based
delay parameters can take effect while VACUUMing a relation. Previously,
autovacuum workers only reloaded the config file once per relation
vacuumed, so config changes could not take effect until beginning to
vacuum the next table.
Now, check if a reload is pending roughly once per block, when checking
if we need to delay.
In order for autovacuum workers to safely update their own cost delay
and cost limit parameters without impacting performance, we had to
rethink when and how these values were accessed.
Previously, an autovacuum worker's wi_cost_limit was set only at the
beginning of vacuuming a table, after reloading the config file.
Therefore, at the time that autovac_balance_cost() was called, workers
vacuuming tables with no cost-related storage parameters could still
have different values for their wi_cost_limit_base and wi_cost_delay.
Now that the cost parameters can be updated while vacuuming a table,
workers will (within some margin of error) have no reason to have
different values for cost limit and cost delay (in the absence of
cost-related storage parameters). This removes the rationale for keeping
cost limit and cost delay in shared memory. Balancing the cost limit
requires only the number of active autovacuum workers vacuuming a table
with no cost-based storage parameters.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
|
|
Vacuum code run both by autovacuum workers and a backend doing
VACUUM/ANALYZE previously inspected VacuumCostLimit and VacuumCostDelay,
which are the global variables backing the GUCs vacuum_cost_limit and
vacuum_cost_delay.
Autovacuum workers needed to override these variables with their
own values, derived from autovacuum_vacuum_cost_limit and
autovacuum_vacuum_cost_delay and worker cost limit balancing logic.
This led to confusing code which, in some cases, both derived and
set a new value of VacuumCostLimit from VacuumCostLimit.
In preparation for refreshing these GUC values more often, introduce
new, independent global variables and add a function to update them
using the GUCs and existing logic.
Per suggestion by Kyotaro Horiguchi
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
|
|
While vacuuming a table in failsafe mode, VacuumCostActive should
not be re-enabled. This currently isn't a problem because vacuum
cost parameters are only refreshed in between vacuuming tables and
failsafe status is reset for every table.
In preparation for allowing vacuum cost parameters to be updated
more frequently, elevate LVRelState->failsafe_active to a global,
VacuumFailsafeActive, which will be checked when determining whether
or not to re-enable vacuum cost-related delays.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ZngzqnEODc7LmS1NH04Kt6Y9huSjz5pp7%2BDXhrjDA0gw%40mail.gmail.com
|
|
vacuum() is used for both the VACUUM command and for autovacuum. There
were many prechecks being done inside vacuum() that were just not relevant
to autovacuum. Let's move the bulk of these into ExecVacuum() so that
they're only executed when running the VACUUM command. This removes a
small amount of overhead when autovacuum vacuums a table.
While we are at it, allocate VACUUM's BufferAccessStrategy in ExecVacuum()
and pass it into vacuum() instead of expecting vacuum() to make it if it's
not already made by the calling function. To make this work, we need to
create the vacuum memory context slightly earlier, so we now need to pass
that down to vacuum() so that it's available for use in other memory
allocations.
Author: Melanie Plageman
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/20230405211534.4skgskbilnxqrmxg@awork3.anarazel.de
|
|
A few places are not converted. Some because they are tackled in later
commits (e.g. hio.c, xlogutils.c), some because they are more
complicated (e.g. brin_pageops.c). Having a few users of ReadBuffer(P_NEW) is
good anyway, to ensure the backward compat path stays working.
Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
|