summaryrefslogtreecommitdiff
path: root/src/backend/utils/cache
AgeCommit message (Collapse)Author
2021-05-12Initial pgindent and pgperltidy run for v14.Tom Lane
Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.
2021-05-07Revert per-index collation version tracking feature.Thomas Munro
Design problems were discovered in the handling of composite types and record types that would cause some relevant versions not to be recorded. Misgivings were also expressed about the use of the pg_depend catalog for this purpose. We're out of time for this release so we'll revert and try again. Commits reverted: 1bf946bd: Doc: Document known problem with Windows collation versions. cf002008: Remove no-longer-relevant test case. ef387bed: Fix bogus collation-version-recording logic. 0fb0a050: Hide internal error for pg_collation_actual_version(<bad OID>). ff942057: Suppress "warning: variable 'collcollate' set but not used". d50e3b1f: Fix assertion in collation version lookup. f24b1569: Rethink extraction of collation dependencies. 257836a7: Track collation versions for indexes. cd6f479e: Add pg_depend.refobjversion. 7d1297df: Remove pg_collation.collversion. Discussion: https://postgr.es/m/CA%2BhUKGLhj5t1fcjqAu8iD9B3ixJtsTNqyCCD4V0aTO9kAKAjjA%40mail.gmail.com
2021-04-28Allow a partdesc-omitting-partitions to be cachedAlvaro Herrera
Makes partition descriptor acquisition faster during the transient period in which a partition is in the process of being detached. This also adds the restriction that only one partition can be in pending-detach state for a partitioned table. While at it, return find_inheritance_children() API to what it was before 71f4c8c6f74b, and create a separate find_inheritance_children_extended() that returns detailed info about detached partitions. (This incidentally fixes a bug in 8aba9322511 whereby a memory context holding a transient partdesc is reparented to a NULL PortalContext, leading to permanent leak of that memory. The fix is to no longer rely on reparenting contexts to PortalContext. Reported by Amit Langote.) Per gripe from Amit Langote Discussion: https://postgr.es/m/CA+HiwqFgpP1LxJZOBYGt9rpvTjXXkg5qG2+Xch2Z1Q7KrqZR1A@mail.gmail.com
2021-04-27Fix Logical Replication of Truncate in synchronous commit mode.Amit Kapila
The Truncate operation acquires an exclusive lock on the target relation and indexes. It then waits for logical replication of the operation to finish at commit. Now because we are acquiring the shared lock on the target index to get index attributes in pgoutput while sending the changes for the Truncate operation, it leads to a deadlock. Actually, we don't need to acquire a lock on the target index as we build the cache entry using a historic snapshot and all the later changes are absorbed while decoding WAL. So, we wrote a special purpose function for logical replication to get a bitmap of replica identity attribute numbers where we get that information without locking the target index. We decided not to backpatch this as there doesn't seem to be any field complaint about this issue since it was introduced in commit 5dfd1e5a in v11. Reported-by: Haiying Tang Author: Takamichi Osumi, test case by Li Japin Reviewed-by: Amit Kapila, Ajin Cherian Discussion: https://postgr.es/m/OS0PR01MB6113C2499C7DC70EE55ADB82FB759@OS0PR01MB6113.jpnprd01.prod.outlook.com
2021-04-06Clean up treatment of missing default and CHECK-constraint records.Tom Lane
Andrew Gierth reported that it's possible to crash the backend if no pg_attrdef record is found to match an attribute that has atthasdef set. AttrDefaultFetch warns about this situation, but then leaves behind a relation tupdesc that has null "adbin" pointer(s), which most places don't guard against. We considered promoting the warning to an error, but throwing errors during relcache load is pretty drastic: it effectively locks one out of using the relation at all. What seems better is to leave the load-time behavior as a warning, but then throw an error in any code path that wants to use a default and can't find it. This confines the error to a subset of INSERT/UPDATE operations on the table, and in particular will at least allow a pg_dump to succeed. Also, we should fix AttrDefaultFetch to not leave any null pointers in the tupdesc, because that just creates an untested bug hazard. While at it, apply the same philosophy of "warn at load, throw error only upon use of the known-missing info" to CHECK constraints. CheckConstraintFetch is very nearly the same logic as AttrDefaultFetch, but for reasons lost in the mists of time, it was throwing ERROR for the same cases that AttrDefaultFetch treats as WARNING. Make the two functions more nearly alike. In passing, get rid of potentially-O(N^2) loops in equalTupleDesc by making AttrDefaultFetch sort the entries after fetching them, so that equalTupleDesc can assume that entries in two equal tupdescs must be in matching order. (CheckConstraintFetch already was sorting CHECK constraints, but equalTupleDesc hadn't been told about it.) There's some argument for back-patching this, but with such a small number of field reports, I'm content to fix it in HEAD. Discussion: https://postgr.es/m/87pmzaq4gx.fsf@news-spur.riddles.org.uk
2021-03-26Add "pg_database_owner" default role.Noah Misch
Membership consists, implicitly, of the current database owner. Expect use in template databases. Once pg_database_owner has rights within a template, each owner of a database instantiated from that template will exercise those rights. Reviewed by John Naylor. Discussion: https://postgr.es/m/20201228043148.GA1053024@rfd.leadboat.com
2021-03-25ALTER TABLE ... DETACH PARTITION ... CONCURRENTLYAlvaro Herrera
Allow a partition be detached from its partitioned table without blocking concurrent queries, by running in two transactions and only requiring ShareUpdateExclusive in the partitioned table. Because it runs in two transactions, it cannot be used in a transaction block. This is the main reason to use dedicated syntax: so that users can choose to use the original mode if they need it. But also, it doesn't work when a default partition exists (because an exclusive lock would still need to be obtained on it, in order to change its partition constraint.) In case the second transaction is cancelled or a crash occurs, there's ALTER TABLE .. DETACH PARTITION .. FINALIZE, which executes the final steps. The main trick to make this work is the addition of column pg_inherits.inhdetachpending, initially false; can only be set true in the first part of this command. Once that is committed, concurrent transactions that use a PartitionDirectory will include or ignore partitions so marked: in optimizer they are ignored if the row is marked committed for the snapshot; in executor they are always included. As a result, and because of the way PartitionDirectory caches partition descriptors, queries that were planned before the detach will see the rows in the detached partition and queries that are planned after the detach, won't. A CHECK constraint is created that duplicates the partition constraint. This is probably not strictly necessary, and some users will prefer to remove it afterwards, but if the partition is re-attached to a partitioned table, the constraint needn't be rechecked. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20200803234854.GA24158@alvherre.pgsql
2021-03-24Revert "Enable parallel SELECT for "INSERT INTO ... SELECT ..."."Amit Kapila
To allow inserts in parallel-mode this feature has to ensure that all the constraints, triggers, etc. are parallel-safe for the partition hierarchy which is costly and we need to find a better way to do that. Additionally, we could have used existing cached information in some cases like indexes, domains, etc. to determine the parallel-safety. List of commits reverted, in reverse chronological order: ed62d3737c Doc: Update description for parallel insert reloption. c8f78b6161 Add a new GUC and a reloption to enable inserts in parallel-mode. c5be48f092 Improve FK trigger parallel-safety check added by 05c8482f7f. e2cda3c20a Fix use of relcache TriggerDesc field introduced by commit 05c8482f7f. e4e87a32cc Fix valgrind issue in commit 05c8482f7f. 05c8482f7f Enable parallel SELECT for "INSERT INTO ... SELECT ...". Discussion: https://postgr.es/m/E1lMiB9-0001c3-SY@gemulon.postgresql.org
2021-03-22Add macro RelationIsPermanent() to report relation permanenceBruce Momjian
Previously, to check relation permanence, the Relation's Form_pg_class structure member relpersistence was compared to the value RELPERSISTENCE_PERMANENT ("p"). This commit adds the macro RelationIsPermanent() and is used in appropirate places to simplify the code. This matches other RelationIs* macros. This macro will be used in more places in future cluster file encryption patches. Discussion: https://postgr.es/m/20210318153134.GH20766@tamriel.snowman.net
2021-03-18Don't run RelationInitTableAccessMethod in a long-lived context.Tom Lane
Some code paths in this function perform syscache lookups, which can lead to table accesses and possibly leakage of cruft into the caller's context. If said context is CacheMemoryContext, we eventually will have visible bloat. But fixing this is no harder than moving one memory context switch step. (The other callers don't have a problem.) Andres Freund and I independently found this via valgrind testing. Back-patch to v12 where this code was added. Discussion: https://postgr.es/m/20210317023101.anvejcfotwka6gaa@alap3.anarazel.de Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us
2021-03-18Don't leak rd_statlist when a relcache entry is dropped.Tom Lane
Although these lists are usually NIL, and even when not empty are unlikely to be large, constant relcache update traffic could eventually result in visible bloat of CacheMemoryContext. Found via valgrind testing. Back-patch to v10 where this field was added. Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us
2021-03-10Enable parallel SELECT for "INSERT INTO ... SELECT ...".Amit Kapila
Parallel SELECT can't be utilized for INSERT in the following cases: - INSERT statement uses the ON CONFLICT DO UPDATE clause - Target table has a parallel-unsafe: trigger, index expression or predicate, column default expression or check constraint - Target table has a parallel-unsafe domain constraint on any column - Target table is a partitioned table with a parallel-unsafe partition key expression or support function The planner is updated to perform additional parallel-safety checks for the cases listed above, for determining whether it is safe to run INSERT in parallel-mode with an underlying parallel SELECT. The planner will consider using parallel SELECT for "INSERT INTO ... SELECT ...", provided nothing unsafe is found from the additional parallel-safety checks, or from the existing parallel-safety checks for SELECT. While checking parallel-safety, we need to check it for all the partitions on the table which can be costly especially when we decide not to use a parallel plan. So, in a separate patch, we will introduce a GUC and or a reloption to enable/disable parallelism for Insert statements. Prior to entering parallel-mode for the execution of INSERT with parallel SELECT, a TransactionId is acquired and assigned to the current transaction state. This is necessary to prevent the INSERT from attempting to assign the TransactionId whilst in parallel-mode, which is not allowed. This approach has a disadvantage in that if the underlying SELECT does not return any rows, then the TransactionId is not used, however that shouldn't happen in practice in many cases. Author: Greg Nancarrow, Amit Langote, Amit Kapila Reviewed-by: Amit Langote, Hou Zhijie, Takayuki Tsunakawa, Antonin Houska, Bharath Rupireddy, Dilip Kumar, Vignesh C, Zhihong Yu, Amit Kapila Tested-by: Tang, Haiying Discussion: https://postgr.es/m/CAJcOf-cXnB5cnMKqWEp2E2z7Mvcd04iLVmV=qpFJrR3AcrTS3g@mail.gmail.com Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com
2021-02-24Fix some typos, grammar and style in docs and commentsMichael Paquier
The portions fixing the documentation are backpatched where needed. Author: Justin Pryzby Discussion: https://postgr.es/m/20210210235557.GQ20012@telsasoft.com backpatch-through: 9.6
2021-02-10Preserve pg_attribute.attstattarget across REINDEX CONCURRENTLYMichael Paquier
For an index, attstattarget can be updated using ALTER INDEX SET STATISTICS. This data was lost on the new index after REINDEX CONCURRENTLY. The update of this field is done when the old and new indexes are swapped to make the fix back-patchable. Another approach we could look after in the long-term is to change index_create() to pass the wanted values of attstattarget when creating the new relation, but, as this would cause an ABI breakage this can be done only on HEAD. Reported-by: Ronan Dunklau Author: Michael Paquier Reviewed-by: Ronan Dunklau, Tomas Vondra Discussion: https://postgr.es/m/16628084.uLZWGnKmhe@laptop-ronand Backpatch-through: 12
2021-01-25Improve performance of repeated CALLs within plpgsql procedures.Tom Lane
This patch essentially is cleaning up technical debt left behind by the original implementation of plpgsql procedures, particularly commit d92bc83c4. That patch (or more precisely, follow-on patches fixing its worst bugs) forced us to re-plan CALL and DO statements each time through, if we're in a non-atomic context. That wasn't for any fundamental reason, but just because use of a saved plan requires having a ResourceOwner to hold a reference count for the plan, and we had no suitable resowner at hand, nor would the available APIs support using one if we did. While it's not that expensive to create a "plan" for CALL/DO, the cycles do add up in repeated executions. This patch therefore makes the following API changes: * GetCachedPlan/ReleaseCachedPlan are modified to let the caller specify which resowner to use to pin the plan, rather than forcing use of CurrentResourceOwner. * spi.c gains a "SPI_execute_plan_extended" entry point that lets callers say which resowner to use to pin the plan. This borrows the idea of an options struct from the recently added SPI_prepare_extended, hopefully allowing future options to be added without more API breaks. This supersedes SPI_execute_plan_with_paramlist (which I've marked deprecated) as well as SPI_execute_plan_with_receiver (which is new in v14, so I just took it out altogether). * I also took the opportunity to remove the crude hack of letting plpgsql reach into SPI private data structures to mark SPI plans as "no_snapshot". It's better to treat that as an option of SPI_prepare_extended. Now, when running a non-atomic procedure or DO block that contains any CALL or DO commands, plpgsql creates a ResourceOwner that will be used to pin the plans of the CALL/DO commands. (In an atomic context, we just use CurrentResourceOwner, as before.) Having done this, we can just save CALL/DO plans normally, whether or not they are used across transaction boundaries. This seems to be good for something like 2X speedup of a CALL of a trivial procedure with a few simple argument expressions. By restricting the creation of an extra ResourceOwner like this, there's essentially zero penalty in cases that can't benefit. Pavel Stehule, with some further hacking by me Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
2021-01-13Fix routine name in comment of catcache.cMichael Paquier
Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACUDXLAkf_XxQO9tAUtnTNGi3Lmd8fANd+vBJbcHn1HoWA@mail.gmail.com
2021-01-06Replace CLOBBER_CACHE_ALWAYS with run-time GUCPeter Eisentraut
Forced cache invalidation (CLOBBER_CACHE_ALWAYS) has been impractical to use for testing in PostgreSQL because it's so slow and because it's toggled on/off only at build time. It is helpful when hunting bugs in any code that uses the sycache/relcache because causes cache invalidations to be injected whenever it would be possible for an invalidation to occur, whether or not one was really pending. Address this by providing run-time control over cache clobber behaviour using the new debug_invalidate_system_caches_always GUC. Support is not compiled in at all unless assertions are enabled or CLOBBER_CACHE_ENABLED is explicitly defined at compile time. It defaults to 0 if compiled in, so it has negligible effect on assert build performance by default. When support is compiled in, test code can now set debug_invalidate_system_caches_always=1 locally to a backend to test specific queries, functions, extensions, etc. Or tests can toggle it globally for a specific test case while retaining normal performance during test setup and teardown. For backwards compatibility with existing test harnesses and scripts, debug_invalidate_system_caches_always defaults to 1 if CLOBBER_CACHE_ALWAYS is defined, and to 3 if CLOBBER_CACHE_RECURSIVE is defined. CLOBBER_CACHE_ENABLED is now visible in pg_config_manual.h, as is the related RECOVER_RELATION_BUILD_MEMORY setting for the relcache. Author: Craig Ringer <craig.ringer@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/CAMsr+YF=+ctXBZj3ywmvKNUjWpxmuTuUKuv-rgbHGX5i5pLstQ@mail.gmail.com
2021-01-02Update copyright for 2021Bruce Momjian
Backpatch-through: 9.5
2020-12-28Fix inconsistent code with shared invalidations of snapshotsMichael Paquier
The code in charge of processing a single invalidation message has been using since 568d413 the structure for relation mapping messages. This had fortunately no consequence as both locate the database ID at the same location, but it could become a problem in the future if this area of the code changes. Author: Konstantin Knizhnik Discussion: https://postgr.es/m/8044c223-4d3a-2cdb-42bf-29940840ce94@postgrespro.ru Backpatch-through: 9.5
2020-12-20Multirange datatypesAlexander Korotkov
Multiranges are basically sorted arrays of non-overlapping ranges with set-theoretic operations defined over them. Since v14, each range type automatically gets a corresponding multirange datatype. There are both manual and automatic mechanisms for naming multirange types. Once can specify multirange type name using multirange_type_name attribute in CREATE TYPE.  Otherwise, a multirange type name is generated automatically. If the range type name contains "range" then we change that to "multirange". Otherwise, we add "_multirange" to the end. Implementation of multiranges comes with a space-efficient internal representation format, which evades extra paddings and duplicated storage of oids.  Altogether this format allows fetching a particular range by its index in O(n). Statistic gathering and selectivity estimation are implemented for multiranges. For this purpose, stored multirange is approximated as union range without gaps. This field will likely need improvements in the future. Catversion is bumped. Discussion: https://postgr.es/m/CALNJ-vSUpQ_Y%3DjXvTxt1VYFztaBSsWVXeF1y6gTYQ4bOiWDLgQ%40mail.gmail.com Discussion: https://postgr.es/m/a0b8026459d1e6167933be2104a6174e7d40d0ab.camel%40j-davis.com#fe7218c83b08068bfffb0c5293eceda0 Author: Paul Jungwirth, revised by me Reviewed-by: David Fetter, Corey Huinker, Jeff Davis, Pavel Stehule Reviewed-by: Alvaro Herrera, Tom Lane, Isaac Morland, David G. Johnston Reviewed-by: Zhihong Yu, Alexander Korotkov
2020-12-18Avoid memcpy() with same source and destination during relmapper init.Tom Lane
A narrow reading of the C standard says that memcpy(x,x,n) is undefined, although it's hard to envision an implementation that would really misbehave. However, analysis tools such as valgrind might whine about this; accordingly, let's band-aid relmapper.c to not do it. See also 5b630501e, d3f4e8a8a, ad7b48ea0, and other similar fixes. Apparently, none of those folk tried valgrinding initdb? This has been like this for long enough that I'm surprised it hasn't been reported before. Back-patch, just in case anybody wants to use a back branch on a platform that complains about this; we back-patched those earlier fixes too. Discussion: https://postgr.es/m/161790.1608310142@sss.pgh.pa.us
2020-12-15Improve hash_create()'s API for some added robustness.Tom Lane
Invent a new flag bit HASH_STRINGS to specify C-string hashing, which was formerly the default; and add assertions insisting that exactly one of the bits HASH_STRINGS, HASH_BLOBS, and HASH_FUNCTION be set. This is in hopes of preventing recurrences of the type of oversight fixed in commit a1b8aa1e4 (i.e., mistakenly omitting HASH_BLOBS). Also, when HASH_STRINGS is specified, insist that the keysize be more than 8 bytes. This is a heuristic, but it should catch accidental use of HASH_STRINGS for integer or pointer keys. (Nearly all existing use-cases set the keysize to NAMEDATALEN or more, so there's little reason to think this restriction should be problematic.) Tweak hash_create() to insist that the HASH_ELEM flag be set, and remove the defaults it had for keysize and entrysize. Since those defaults were undocumented and basically useless, no callers omitted HASH_ELEM anyway. Also, remove memset's zeroing the HASHCTL parameter struct from those callers that had one. This has never been really necessary, and while it wasn't a bad coding convention it was confusing that some callers did it and some did not. We might as well save a few cycles by standardizing on "not". Also improve the documentation for hash_create(). In passing, improve reinit.c's usage of a hash table by storing the key as a binary Oid rather than a string; and, since that's a temporary hash table, allocate it in CurrentMemoryContext for neatness. Discussion: https://postgr.es/m/590625.1607878171@sss.pgh.pa.us
2020-12-11Provide an error cursor for "can't subscript" error messages.Tom Lane
Commit c7aba7c14 didn't add this, but after more fooling with the feature I feel that it'd be useful. To make this possible, refactor getSubscriptingRoutines() so that the caller is responsible for throwing any error. (In clauses.c, I just chose to make the most conservative assumption rather than throwing an error. We don't expect failures there anyway really, so the code space for an error message would be a poor investment.)
2020-12-09Support subscripting of arbitrary types, not only arrays.Tom Lane
This patch generalizes the subscripting infrastructure so that any data type can be subscripted, if it provides a handler function to define what that means. Traditional variable-length (varlena) arrays all use array_subscript_handler(), while the existing fixed-length types that support subscripting use raw_array_subscript_handler(). It's expected that other types that want to use subscripting notation will define their own handlers. (This patch provides no such new features, though; it only lays the foundation for them.) To do this, move the parser's semantic processing of subscripts (including coercion to whatever data type is required) into a method callback supplied by the handler. On the execution side, replace the ExecEvalSubscriptingRef* layer of functions with direct calls to callback-supplied execution routines. (Thus, essentially no new run-time overhead should be caused by this patch. Indeed, there is room to remove some overhead by supplying specialized execution routines. This patch does a little bit in that line, but more could be done.) Additional work is required here and there to remove formerly hard-wired assumptions about the result type, collation, etc of a SubscriptingRef expression node; and to remove assumptions that the subscript values must be integers. One useful side-effect of this is that we now have a less squishy mechanism for identifying whether a data type is a "true" array: instead of wiring in weird rules about typlen, we can look to see if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER. For this to be bulletproof, we have to forbid user-defined types from using that handler directly; but there seems no good reason for them to do so. This patch also removes assumptions that the number of subscripts is limited to MAXDIM (6), or indeed has any hard-wired limit. That limit still applies to types handled by array_subscript_handler or raw_array_subscript_handler, but to discourage other dependencies on this constant, I've moved it from c.h to utils/array.h. Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov, Peter Eisentraut, Pavel Stehule Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com
2020-12-09Change get_constraint_index() to use pg_constraint.conindidPeter Eisentraut
It was still using a scan of pg_depend instead of using the conindid column that has been added since. Since it is now just a catalog lookup wrapper and not related to pg_depend, move from pg_depend.c to lsyscache.c. Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/4688d55c-9a2e-9a5a-d166-5f24fe0bf8db%40enterprisedb.com
2020-11-19Hash support for row typesPeter Eisentraut
Add hash functions for the record type as well as a hash operator family and operator class for the record type. This enables all the hash functionality for the record type such as hash-based plans for UNION/INTERSECT/EXCEPT DISTINCT, recursive queries using UNION DISTINCT, hash joins, and hash partitioning. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/38eccd35-4e2d-6767-1b3c-dada1eac3124%402ndquadrant.com
2020-11-07Move catalog index declarationsPeter Eisentraut
Move the system catalog index declarations from catalog/indexing.h to the respective parent tables' catalog/pg_*.h files. The original reason for having it split was that the old genbki system produced the output in the order of the catalog files it read, so all the indexing stuff needed to come separately. But this is no longer the case, and keeping it together makes more sense. Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/c7cc82d6-f976-75d6-2e3e-b03d2cab26bb@2ndquadrant.com
2020-11-03Track collation versions for indexes.Thomas Munro
Record the current version of dependent collations in pg_depend when creating or rebuilding an index. When accessing the index later, warn that the index may be corrupted if the current version doesn't match. Thanks to Douglas Doole, Peter Eisentraut, Christoph Berg, Laurenz Albe, Michael Paquier, Robert Haas, Tom Lane and others for very helpful discussion. Author: Thomas Munro <thomas.munro@gmail.com> Author: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> (earlier versions) Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
2020-10-28Don't use custom OID symbols in pg_proc.dat.Tom Lane
We have a perfectly good convention for OID macros for built-in functions already, so making custom symbols is just introducing unnecessary deviation from the convention. Remove the one case that had snuck in, and add an error check in genbki.pl to discourage future instances. Although this touches pg_proc.dat, there's no need for a catversion bump since the actual catalog data isn't changed. John Naylor Discussion: https://postgr.es/m/CAFBsxsHpCbjfoddNGpnnnY5pHwckWfiYkMYSF74PmP1su0+ZOw@mail.gmail.com
2020-10-25Fix incorrect parameter name in a function header commentDavid Rowley
Author: Zhijie Hou Discussion: https://postgr.es/m/14cd74ea00204cc8a7ea5d738ac82cd1@G08CNEXMBPEKD05.g08.fujitsu.local Backpatch-through: 12, where the mistake was introduced
2020-09-09Fix rd_firstRelfilenodeSubid for nailed relations, in parallel workers.Noah Misch
Move applicable code out of RelationBuildDesc(), which nailed relations bypass. Non-assert builds experienced no known problems. Back-patch to v13, where commit c6b92041d38512a4176ed76ad06f713d2e6c01a8 introduced rd_firstRelfilenodeSubid. Kyotaro Horiguchi. Reported by Justin Pryzby. Discussion: https://postgr.es/m/20200907023737.GA7158@telsasoft.com
2020-08-30Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE.Tom Lane
Historically, we've considered the state with relpages and reltuples both zero as indicating that we do not know the table's tuple density. This is problematic because it's impossible to distinguish "never yet vacuumed" from "vacuumed and seen to be empty". In particular, a user cannot use VACUUM or ANALYZE to override the planner's normal heuristic that an empty table should not be believed to be empty because it is probably about to get populated. That heuristic is a good safety measure, so I don't care to abandon it, but there should be a way to override it if the table is indeed intended to stay empty. Hence, represent the initial state of ignorance by setting reltuples to -1 (relpages is still set to zero), and apply the minimum-ten-pages heuristic only when reltuples is still -1. If the table is empty, VACUUM or ANALYZE (but not CREATE INDEX) will override that to reltuples = relpages = 0, and then we'll plan on that basis. This requires a bunch of fiddly little changes, but we can get rid of some ugly kluges that were formerly needed to maintain the old definition. One notable point is that FDWs' GetForeignRelSize methods will see baserel->tuples = -1 when no ANALYZE has been done on the foreign table. That seems like a net improvement, since those methods were formerly also in the dark about what baserel->tuples = 0 really meant. Still, it is an API change. I bumped catversion because code predating this change would get confused by seeing reltuples = -1. Discussion: https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com
2020-07-23WAL Log invalidations at command end with wal_level=logical.Amit Kapila
When wal_level=logical, write invalidations at command end into WAL so that decoding can use this information. This patch is required to allow the streaming of in-progress transactions in logical decoding.  The actual work to allow streaming will be committed as a separate patch. We still add the invalidations to the cache and write them to WAL at commit time in RecordTransactionCommit(). This uses the existing XLOG_INVALIDATIONS xlog record type, from the RM_STANDBY_ID resource manager (see LogStandbyInvalidations for details). So existing code relying on those invalidations (e.g. redo) does not need to be changed. The invalidations written at command end uses a new xlog record type XLOG_XACT_INVALIDATIONS, from RM_XACT_ID resource manager. See LogLogicalInvalidations for details. These new xlog records are ignored by existing redo procedures, which still rely on the invalidations written to commit records. The invalidations are decoded and accumulated in top-transaction, and then executed during replay.  This obviates the need to decode the invalidations as part of a commit record. Bump XLOG_PAGE_MAGIC, since this introduces XLOG_XACT_INVALIDATIONS. Author: Dilip Kumar, Tomas Vondra, Amit Kapila Reviewed-by: Amit Kapila Tested-by: Neha Sharma and Mahendra Singh Thalor Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2020-07-20Add generic_plans and custom_plans fields into pg_prepared_statements.Fujii Masao
There was no easy way to find how many times generic and custom plans have been executed for a prepared statement. This commit exposes those numbers of times in pg_prepared_statements view. Author: Atsushi Torikoshi, Kyotaro Horiguchi Reviewed-by: Tatsuro Yamada, Masahiro Ikeda, Fujii Masao Discussion: https://postgr.es/m/CACZ0uYHZ4M=NZpofH6JuPHeX=__5xcDELF8hT8_2T+R55w4RQw@mail.gmail.com
2020-07-08Fix function name in comment.Fujii Masao
Author: Masahiro Ikeda Discussion: https://postgr.es/m/0043eee90b38351ea199d7e3294c10c4@oss.nttdata.com
2020-07-07Don't create pg_type entries for sequences or toast tables.Tom Lane
Commit f7f70d5e2 left one inconsistency behind: we're still creating pg_type entries for the composite types of sequences and toast tables, but not arrays over those composites. But there seems precious little reason to have named composite types for toast tables, and not much more to have them for sequences (especially given the thought that sequences may someday not be standalone relations at all). So, let's close that inconsistency by removing these composite types, rather than adding arrays for them. This buys back a little bit of the initial pg_type bloat added by the previous patch, and could be a significant savings in a large database with many toast tables. Aside from a small logic rearrangement in heap_create_with_catalog, this patch mostly needs to clean up some places that were assuming that pg_class.reltype always has a valid value. Those are really pre-existing bugs, given that it's documented otherwise; notably, the plpgsql changes fix code that gives "cache lookup failed for type 0" on indexes today. But none of these seem interesting enough to back-patch. Also, remove the pg_dump/pg_upgrade infrastructure for propagating a toast table's pg_type OID into the new database, since we no longer need that. Discussion: https://postgr.es/m/761F1389-C6A8-4C15-80CE-950C961F5341@gmail.com
2020-06-07Spelling adjustmentsPeter Eisentraut
2020-06-05Improve ineq_histogram_selectivity's behavior for non-default orderings.Tom Lane
ineq_histogram_selectivity() can be invoked in situations where the ordering we care about is not that of the column's histogram. We could be considering some other collation, or even more drastically, the query operator might not agree at all with what was used to construct the histogram. (We'll get here for anything using scalarineqsel-based estimators, so that's quite likely to happen for extension operators.) Up to now we just ignored this issue and assumed we were dealing with an operator/collation whose sort order exactly matches the histogram, possibly resulting in junk estimates if the binary search gets confused. It's past time to improve that, since the use of nondefault collations is increasing. What we can do is verify that the given operator and collation match what's recorded in pg_statistic, and use the existing code only if so. When they don't match, instead execute the operator against each histogram entry, and take the fraction of successes as our selectivity estimate. This gives an estimate that is probably good to about 1/histogram_size, with no assumptions about ordering. (The quality of the estimate is likely to degrade near the ends of the value range, since the two orderings probably don't agree on what is an extremal value; but this is surely going to be more reliable than what we did before.) At some point we might further improve matters by storing more than one histogram calculated according to different orderings. But this code would still be good fallback logic when no matches exist, so that is not an argument for not doing this. While here, also improve get_variable_range() to deal more honestly with non-default collations. This isn't back-patchable, because it requires adding another argument to ineq_histogram_selectivity, and because it might have significant impact on the estimation results for extension operators relying on scalarineqsel --- mostly for the better, one hopes, but in any case destabilizing plan choices in back branches is best avoided. Per investigation of a report from James Lucas. Discussion: https://postgr.es/m/CAAFmbbOvfi=wMM=3qRsPunBSLb8BFREno2oOzSBS=mzfLPKABw@mail.gmail.com
2020-05-15Rename assorted LWLock tranches.Tom Lane
Choose names that fit into the conventions for wait event names (particularly, that multi-word names are in the style MultiWordName) and hopefully convey more information to non-hacker users than the previous names did. Also rename SerializablePredicateLockListLock to SerializablePredicateListLock; the old name was long enough to cause table formatting problems, plus the double occurrence of "Lock" seems confusing/error-prone. Also change a couple of particularly opaque LWLock field names. Discussion: https://postgr.es/m/28683.1589405363@sss.pgh.pa.us
2020-05-14Initial pgindent and pgperltidy run for v13.Tom Lane
Includes some manual cleanup of places that pgindent messed up, most of which weren't per project style anyway. Notably, it seems some people didn't absorb the style rules of commit c9d297751, because there were a bunch of new occurrences of function calls with a newline just after the left paren, all with faulty expectations about how the rest of the call would get indented.
2020-04-08Allow publishing partition changes via ancestorsPeter Eisentraut
To control whether partition changes are replicated using their own identity and schema or an ancestor's, add a new parameter that can be set per publication named 'publish_via_partition_root'. This allows replicating a partitioned table into a different partition structure on the subscriber. Author: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Reviewed-by: Petr Jelinek <petr@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-04-06Preserve clustered index after rewrites with ALTER TABLEMichael Paquier
A table rewritten by ALTER TABLE would lose tracking of an index usable for CLUSTER. This setting is tracked by pg_index.indisclustered and is controlled by ALTER TABLE, so some extra work was needed to restore it properly. Note that ALTER TABLE only marks the index that can be used for clustering, and does not do the actual operation. Author: Amit Langote, Justin Pryzby Reviewed-by: Ibrar Ahmed, Michael Paquier Discussion: https://postgr.es/m/20200202161718.GI13621@telsasoft.com Backpatch-through: 9.5
2020-04-05Fix recently introduced typo.Andres Freund
Reported-By: David Rowley
2020-04-04Skip WAL for new relfilenodes, under wal_level=minimal.Noah Misch
Until now, only selected bulk operations (e.g. COPY) did this. If a given relfilenode received both a WAL-skipping COPY and a WAL-logged operation (e.g. INSERT), recovery could lose tuples from the COPY. See src/backend/access/transam/README section "Skipping WAL for New RelFileNode" for the new coding rules. Maintainers of table access methods should examine that section. To maintain data durability, just before commit, we choose between an fsync of the relfilenode and copying its contents to WAL. A new GUC, wal_skip_threshold, guides that choice. If this change slows a workload that creates small, permanent relfilenodes under wal_level=minimal, try adjusting wal_skip_threshold. Users setting a timeout on COMMIT may need to adjust that timeout, and log_min_duration_statement analysis will reflect time consumption moving to COMMIT from commands like COPY. Internally, this requires a reliable determination of whether RollbackAndReleaseCurrentSubTransaction() would unlink a relation's current relfilenode. Introduce rd_firstRelfilenodeSubid. Amend the specification of rd_createSubid such that the field is zero when a new rel has an old rd_node. Make relcache.c retain entries for certain dropped relations until end of transaction. Bump XLOG_PAGE_MAGIC, since this introduces XLOG_GIST_ASSIGN_LSN. Future servers accept older WAL, so this bump is discretionary. Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert Haas. Heikki Linnakangas and Michael Paquier implemented earlier designs that materially clarified the problem. Reviewed, in earlier designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane, Fujii Masao, and Simon Riggs. Reported by Martijn van Oosterhout. Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org
2020-03-30Remove rudiments of supporting procnum == 0 from 911e702077Alexander Korotkov
Early versions of opclass options patch uses zero support procedure as opclass options procedure. This commit removes rudiments of it, which were committed in 911e702077. Also, it implements correct handling of amoptsprocnum == 0.
2020-03-30Implement operator class parametersAlexander Korotkov
PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30Allow the planner-related functions and hook to accept the query string.Fujii Masao
This commit adds query_string argument into the planner-related functions and hook and allows us to pass the query string to them. Currently there is no user of the query string passed. But the upcoming patch for the planning counters will add the planning hook function into pg_stat_statements and the function will need the query string. So this change will be necessary for that patch. Also this change is useful for some extensions that want to use the query string in their planner hook function. Author: Pascal Legrand, Julien Rouhaud Reviewed-by: Yoshikazu Imai, Tom Lane, Fujii Masao Discussion: https://postgr.es/m/CAOBaU_bU1m3_XF5qKYtSj1ua4dxd=FWDyh2SH4rSJAUUfsGmAQ@mail.gmail.com Discussion: https://postgr.es/m/1583789487074-0.post@n3.nabble.com
2020-03-28Ensure snapshot is registered within ScanPgRelation().Andres Freund
In 9.4 I added support to use a historical snapshot in ScanPgRelation(), while adding logical decoding. Unfortunately a conflict with the concurrent removal of SnapshotNow was incorrectly resolved, leading to an unregistered snapshot being used. It is not correct to use an unregistered (or non-active) snapshot for anything non-trivial, because catalog invalidations can cause the snapshot to be invalidated. Luckily it seems unlikely to actively cause problems in practice, as ScanPgRelation() requires that we already have a lock on the relation, we only look for a single row, and we don't appear to rely on the result's tid to be correct. It however is clearly wrong and potential negative consequences would likely be hard to find. So it seems worth backpatching the fix, even without a concrete hazard. Discussion: https://postgr.es/m/20200229052459.wzhqnbhrriezg4v2@alap3.anarazel.de Backpatch: 9.5-
2020-03-27Rearrange validity checks for plpgsql "simple" expressions.Tom Lane
Buildfarm experience shows what probably should've occurred to me before: if a cache flush occurs partway through building a generic plan, then the plansource may have is_valid = false even though the plan is valid. We need to accept this case, use the generated plan, and then try to replan the next time. We can't try to replan immediately, because that would produce an infinite loop in CLOBBER_CACHE_ALWAYS builds; moreover it's really overkill. (We can assume that the plan is valid, it's just possibly a bit stale. Note that the pre-existing code behaved this way, and the non-simple-expression code paths do too.) Conversely, not using the generated plan would drop us into the not-a-simple-expression code path, which is bad for performance and would also cause regression-test failures due to visibly different error-reporting behavior. Hence, refactor the validity-check functions so that the initial check and recheck cases can react differently to plansource->is_valid. This makes their usage a bit simpler, too. Discussion: https://postgr.es/m/7072.1585332104@sss.pgh.pa.us
2020-03-26Improve performance of "simple expressions" in PL/pgSQL.Tom Lane
For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's management overhead exceeds the cost of evaluating the expression. This patch substantially improves that situation, providing roughly 2X speedup for such trivial expressions. First, add infrastructure in the plancache to allow fast re-validation of cached plans that contain no table access, and hence need no locks. Teach plpgsql to use this infrastructure for expressions that it's already deemed "simple" (which in particular will never contain table references). The fast path still requires checking that search_path hasn't changed, so provide a fast path for OverrideSearchPathMatchesCurrent by counting changes that have occurred to the active search path in the current session. This is simplistic but seems enough for now, seeing that PushOverrideSearchPath is not used in any performance-critical cases. Second, manage the refcounts on simple expressions' cached plans using a transaction-lifespan resource owner, so that we only need to take and release an expression's refcount once per transaction not once per expression evaluation. The management of this resource owner exactly parallels the existing management of plpgsql's simple-expression EState. Add some regression tests covering this area, in particular verifying that expression caching doesn't break semantics for search_path changes. Patch by me, but it owes something to previous work by Amit Langote, who recognized that getting rid of plancache-related overhead would be a useful thing to do here. Also thanks to Andres Freund for review. Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com