path: root/src/include
Age  Commit message  (Author)
2020-10-02  Add pg_stat_wal statistics view.  (Fujii Masao)
This view shows the statistics about WAL activity. Currently it has only two columns: wal_buffers_full and stats_reset. wal_buffers_full column indicates the number of times WAL data was written to the disk because WAL buffers got full. This information is useful when tuning wal_buffers. stats_reset column indicates the time at which these statistics were last reset. pg_stat_wal view is also the basic infrastructure to expose other various statistics about WAL activity later. Bump PGSTAT_FILE_FORMAT_ID due to the change in pgstat format. Bump catalog version. Author: Masahiro Ikeda Reviewed-by: Takayuki Tsunakawa, Kyotaro Horiguchi, Amit Kapila, Fujii Masao Discussion: https://postgr.es/m/188bd3f2d2233cf97753b5ced02bb050@oss.nttdata.com
2020-09-28  Add for_each_from, to simplify loops starting from non-first list cells.  (Tom Lane)
We have a dozen or so places that need to iterate over all but the first cell of a List. Prior to v13 this was typically written as
    for_each_cell(lc, lnext(list_head(list)))
Commit 1cff1b95a changed these to
    for_each_cell(lc, list, list_second_cell(list))
This patch introduces a new macro for_each_from() which expresses the start point as a list index, allowing these to be written as
    for_each_from(lc, list, 1)
This is marginally more efficient, since ForEachState.i can be initialized directly instead of backing into it from a ListCell address. It also seems clearer and less typo-prone.
Some of the remaining uses of for_each_cell() look like they could profitably be changed to for_each_from(), but here I confined myself to changing uses of list_second_cell().
Also, fix for_each_cell_setup() and for_both_cell_setup() to const-ify their arguments; that's a simple oversight in 1cff1b95a.
Back-patch into v13, on the grounds that (1) the const-ification is a minor bug fix, and (2) it's better for back-patching purposes if we only have two ways to write these loops rather than three.
In HEAD, also remove list_third_cell() and list_fourth_cell(), which were also introduced in 1cff1b95a, and are unused as of cc99baa43. It seems unlikely that any third-party code would have started to use them already; anyone who has can be directed to list_nth_cell instead.
Discussion: https://postgr.es/m/CAApHDvpo1zj9KhEpU2cCRZfSM3Q6XGdhzuAS2v79PH7WJBkYVA@mail.gmail.com
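The idea of starting a list loop from an index can be sketched in a few lines of standalone C. This is only an illustration, not the macro from pg_list.h: DemoList, demo_for_each_from and sum_all_but_first are hypothetical names, and the list is assumed to be a plain array with a length.

```c
#include <stddef.h>

/* Hypothetical array-backed list; a simplified stand-in, not pg_list.h's List. */
typedef struct DemoList
{
    int     length;     /* number of valid entries in elements[] */
    int    *elements;   /* the cells, stored contiguously */
} DemoList;

/*
 * Sketch of an index-based loop macro: the loop counter is initialized
 * directly from N rather than being derived from a cell address.  "cell"
 * is set to each element in turn, and to NULL once the loop finishes.
 */
#define demo_for_each_from(cell, lst, N) \
    for (int cell##__i = (N); \
         (cell##__i < (lst)->length) ? \
         ((cell) = &(lst)->elements[cell##__i], 1) : ((cell) = NULL, 0); \
         cell##__i++)

/* Sum every element except the first by starting the loop at index 1. */
static int
sum_all_but_first(const DemoList *list)
{
    const int  *lc;
    int         total = 0;

    demo_for_each_from(lc, list, 1)
        total += *lc;
    return total;
}
```

With a list shorter than the start index the condition is false on the first test, so the body never runs and no cell is dereferenced.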
2020-09-29  Revert "Change SHA2 implementation based on OpenSSL to use EVP digest routines"  (Michael Paquier)
This reverts commit e21cbb4, as the switch to EVP routines requires a more careful design where we would need to have at least our wrapper routines return a status instead of issuing an error by themselves to let the caller do the error handling. The memory handling was also incorrect and could cause leaks in the backend if a failure happened, requiring most likely a callback to do the necessary cleanup as the only clean way to be able to allocate an EVP context requires the use of an allocation within OpenSSL. The potential rework of the wrappers also impacts the fallback implementation when not building with OpenSSL. Originally, prairiedog has reported a compilation failure, but after discussion with Tom Lane this needs a better design. Discussion: https://postgr.es/m/20200928073330.GC2316@paquier.xyz
2020-09-28  Change SHA2 implementation based on OpenSSL to use EVP digest routines  (Michael Paquier)
The use of low-level hash routines has not been recommended by upstream OpenSSL since 2000, and pgcrypto already switched to EVP as of 5ff4a67. Note that this also fixes a failure with SCRAM authentication when using FIPS in OpenSSL, but as there have been few complaints about this problem and as this causes an ABI breakage, no backpatch is done. Author: Michael Paquier, Alessandro Gherardi Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/20200924025314.GE7405@paquier.xyz Discussion: https://postgr.es/m/20180911030250.GA27115@paquier.xyz
2020-09-28  Improve pg_list.h's linitial(), lsecond() and co macros  (David Rowley)
Prior to this commit, the linitial(), lsecond(), lthird(), lfourth() macros and their int and Oid list cousins would call their corresponding inlined function to fetch the cell of interest. Those inline functions were kind enough to return NULL if the particular cell did not exist. Unfortunately, the care that these functions took was of no relevance to the calling macros as they proceeded to directly dereference the returned value without any regard to whether that value was NULL or not. If it had been, we'd have segfaulted. Of course, the fact that we would have segfaulted on misuse of these macros just goes to prove that nobody is relying on the empty or list too small checks. So here we just get rid of those checks completely. The existing inline functions have been left alone as someone may be using those directly. We just replace the call within each macro to use list_nth_cell(). For the llast*() case we require a new list_last_cell() inline function to get away from the multiple evaluation hazard that we'd get if we fetched ->length on the macro's parameter. Author: David Rowley Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAApHDvpo1zj9KhEpU2cCRZfSM3Q6XGdhzuAS2v79PH7WJBkYVA@mail.gmail.com
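The multiple-evaluation hazard mentioned for the llast*() case can be shown with a small standalone sketch. All names here (DemoList, demo_llast_unsafe, demo_list_last_cell, demo_llast, next_list) are hypothetical and only mimic the shape of the problem; they are not the definitions in pg_list.h.

```c
#include <stddef.h>

/* Hypothetical array-backed list, used only for this illustration. */
typedef struct DemoList
{
    int     length;
    int    *elements;
} DemoList;

/* Hazardous form: the argument "l" is expanded, and so evaluated, twice. */
#define demo_llast_unsafe(l) ((l)->elements[(l)->length - 1])

/* Safer form: an inline function evaluates its argument exactly once. */
static inline int *
demo_list_last_cell(const DemoList *list)
{
    return &list->elements[list->length - 1];
}

#define demo_llast(l) (*demo_list_last_cell(l))

/* A function with a visible side effect, to count evaluations. */
static int next_list_calls = 0;

static DemoList *
next_list(DemoList *l)
{
    next_list_calls++;
    return l;
}
```

Passing next_list(&list) to demo_llast_unsafe() calls next_list() twice, while demo_llast() calls it once; with an argument whose side effects matter, only the second form behaves as the caller expects.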
2020-09-27  Move resolution of AlternativeSubPlan choices to the planner.  (Tom Lane)
When commit bd3daddaf introduced AlternativeSubPlans, I had some ambitions towards allowing the choice of subplan to change during execution. That has not happened, or even been thought about, in the ensuing twelve years; so it seems like a failed experiment. So let's rip that out and resolve the choice of subplan at the end of planning (in setrefs.c) rather than during executor startup. This has a number of positive benefits:
* Removal of a few hundred lines of executor code, since AlternativeSubPlans need no longer be supported there.
* Removal of executor-startup overhead (particularly, initialization of subplans that won't be used).
* Removal of incidental costs of having a larger plan tree, such as tree-scanning and copying costs in the plancache; not to mention setrefs.c's own costs of processing the discarded subplans.
* EXPLAIN no longer has to print a weird (and undocumented) representation of an AlternativeSubPlan choice; it sees only the subplan actually used. This should mean less confusion for users.
* Since setrefs.c knows which subexpression of a plan node it's working on at any instant, it's possible to adjust the estimated number of executions of the subplan based on that. For example, we should usually estimate more executions of a qual expression than a targetlist expression. The implementation used here is pretty simplistic, because we don't want to expend a lot of cycles on the issue; but it's better than ignoring the point entirely, as the executor had to.
That last point might possibly result in shifting the choice between hashed and non-hashed EXISTS subplans in a few cases, but in general this patch isn't meant to change planner choices. Since we're doing the resolution so late, it's really impossible to change any plan choices outside the AlternativeSubPlan itself.
Patch by me; thanks to David Rowley for review. Discussion: https://postgr.es/m/1992952.1592785225@sss.pgh.pa.us
2020-09-26  Fix the logical replication from HEAD to lower versions.  (Amit Kapila)
Commit 464824323e changed the logical replication protocol to allow the streaming of in-progress transactions and used the new version of protocol irrespective of the server version. Use the appropriate version of the protocol based on the server version. Reported-by: Ashutosh Sharma Author: Dilip Kumar Reviewed-by: Ashutosh Sharma and Amit Kapila Discussion: https://postgr.es/m/CAE9k0P=9OpXcNrcU5Gsvd5MZ8GFpiN833vNHzX6Uc=8+h1ft1Q@mail.gmail.com
2020-09-25  Defer flushing of SLRU files.  (Thomas Munro)
Previously, we called fsync() after writing out individual pg_xact, pg_multixact and pg_commit_ts pages due to cache pressure, leading to regular I/O stalls in user backends and recovery. Collapse requests for the same file into a single system call as part of the next checkpoint, as we already did for relation files, using the infrastructure developed by commit 3eb77eba. This can cause a significant improvement to recovery performance, especially when it's otherwise CPU-bound. Hoist ProcessSyncRequests() up into CheckPointGuts() to make it clearer that it applies to all the SLRU mini-buffer-pools as well as the main buffer pool. Rearrange things so that data collected in CheckpointStats includes SLRU activity. Also remove the Shutdown{CLOG,CommitTS,SUBTRANS,MultiXact}() functions, because they were redundant after the shutdown checkpoint that immediately precedes them. (I'm not sure if they were ever needed, but they aren't now.) Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (parts) Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> Discussion: https://postgr.es/m/CA+hUKGLJ=84YT+NvhkEEDAuUtVHMfQ9i-N7k_o50JmQ6Rpj_OQ@mail.gmail.com
2020-09-24  Expose oldSnapshotControl definition via new header.  (Robert Haas)
This makes it possible for code outside snapmgr.c to examine the contents of this data structure. This commit does not add any code which actually does so; a subsequent commit will make that change. Patch by me, reviewed by Thomas Munro, Dilip Kumar, Hamid Akhtar. Discussion: http://postgr.es/m/CA+TgmoY=aqf0zjTD+3dUWYkgMiNDegDLFjo+6ze=Wtpik+3XqA@mail.gmail.com
2020-09-23  Improve behavior of tsearch_readline(), and remove t_readline().  (Tom Lane)
Commit fbeb9da22, which added the tsearch_readline APIs, left t_readline() in place as a compatibility measure. But that function has been unused and deprecated for twelve years now, so that seems like enough time to remove it. Doing so, and merging t_readline's code into tsearch_readline, aids in making several useful improvements:
* The hard-wired 4K limit on line length in tsearch data files is removed, by using a StringInfo buffer instead of a fixed-size buffer.
* We can buy back the per-line palloc/pfree added by 3ea7e9550 in the common case where encoding conversion is not required.
* We no longer need a separate pg_verify_mbstr call, as that functionality was folded into encoding conversion some time ago.
(We could have done some of this stuff while keeping t_readline as a separate API, but there seems little point, since there's no reason for anyone to still be using t_readline directly.)
Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
2020-09-23  Improve error cursor positions for problems with partition bounds.  (Tom Lane)
We failed to pass down the query string to check_new_partition_bound, so that its attempts to provide error cursor positions were for naught; one must have the query string to get parser_errposition to do anything. Adjust its API to require a ParseState to be passed down. Also, improve the logic inside check_new_partition_bound so that the cursor points at the partition bound for the specific column causing the issue, when one can be identified. That part is also for naught if we can't determine the query position of the column with the problem. Improve transformPartitionBoundValue so that it makes sure that const-simplified partition expressions will be properly labeled with positions. In passing, skip calling evaluate_expr if the value is already a Const, which is surely the most common case. Alexandra Wang, Ashwin Agrawal, Amit Langote; reviewed by Ashutosh Bapat Discussion: https://postgr.es/m/CACiyaSopZoqssfMzgHk6fAkp01cL6vnqBdmTw2C5_KJaFR_aMg@mail.gmail.com Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com
2020-09-22  Rethink API for pg_get_line.c, one more time.  (Tom Lane)
Further experience says that the appending behavior offered by pg_get_line_append is useful to only a very small minority of callers. For most, the requirement to reset the buffer after each line is just an error-prone nuisance. Hence, invent another alternative call pg_get_line_buf, which takes care of that detail. Noted while reviewing a patch from Daniel Gustafsson. Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
2020-09-19  Remove large fill factor support from dynahash.c.  (Thomas Munro)
Since ancient times we have had support for a fill factor (maximum load factor) to be set for a dynahash hash table, but:
1. It was an integer, whereas for in-memory hash tables interesting load factor targets are probably somewhere near the 0.75-1.0 range.
2. It was implemented in a way that performed an expensive division operation that regularly showed up in profiles.
3. We are not aware of anyone ever having used a non-default value.
Therefore, remove support, effectively fixing it at 1.
Author: Jakub Wartak <Jakub.Wartak@tomtom.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
2020-09-18  Allow most keywords to be used as column labels without requiring AS.  (Tom Lane)
Up to now, if you tried to omit "AS" before a column label in a SELECT list, it would only work if the column label was an IDENT, that is not any known keyword. This is rather unfriendly considering that we have so many keywords and are constantly growing more. In the wake of commit 1ed6b8956 it's possible to improve matters quite a bit. We'd originally tried to make this work by having some of the existing keyword categories be allowed without AS, but that didn't work too well, because each category contains a few special cases that don't work without AS. Instead, invent an entirely orthogonal keyword property "can be bare column label", and mark all keywords that way for which we don't get shift/reduce errors by doing so. It turns out that of our 450 current keywords, all but 39 can be made bare column labels, improving the situation by over 90%. This number might move around a little depending on future grammar work, but it's a pretty nice improvement. Mark Dilger, based on work by myself and Robert Haas; review by John Naylor Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
2020-09-17  Remove support for postfix (right-unary) operators.  (Tom Lane)
This feature has been a thorn in our sides for a long time, causing many grammatical ambiguity problems. It doesn't seem worth the pain to continue to support it, so remove it. There are some follow-on improvements we can make in the grammar, but this commit only removes the bare minimum number of productions, plus assorted backend support code. Note that pg_dump and psql continue to have full support, since they may be used against older servers. However, pg_dump warns about postfix operators. There is also a check in pg_upgrade. Documentation-wise, I (tgl) largely removed the "left unary" terminology in favor of saying "prefix operator", which is a more standard and IMO less confusing term. I included a catversion bump, although no initial catalog data changes here, to mark the boundary at which oprkind = 'r' stopped being valid in pg_operator. Mark Dilger, based on work by myself and Robert Haas; review by John Naylor Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
2020-09-17  Remove factorial operators, leaving only the factorial() function.  (Tom Lane)
The "!" operator is our only built-in postfix operator. Remove it, on the way to removal of grammar support for postfix operators. There is also a "!!" prefix operator, but since it's been marked deprecated for most of its existence, we might as well remove it too. Also zap the SQL alias function numeric_fac(), which seems to have equally little reason to live. Mark Dilger, based on work by myself and Robert Haas; review by John Naylor Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
2020-09-17  Improve common/logging.c's support for multiple verbosity levels.  (Tom Lane)
Instead of hard-wiring specific verbosity levels into the option processing of client applications, invent pg_logging_increase_verbosity() and encourage clients to implement --verbose by calling that. Then, the common convention that more -v's gets you more verbosity just works. In particular, this allows resurrection of the debug-grade messages that have long existed in pg_dump and its siblings. They were unreachable before this commit due to lack of a way to select PG_LOG_DEBUG logging level. (It appears that they may have been unreachable for some time before common/logging.c was introduced, too, so I'm not specifically blaming cc8d41511 for the oversight. One reason for thinking that is that it's now apparent that _allocAH()'s message needs a null-pointer guard. Testing might have failed to reveal that before 96bf88d52.) Discussion: https://postgr.es/m/1173106.1600116625@sss.pgh.pa.us
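The convention that each additional -v raises verbosity by one step might look roughly like the following standalone sketch. The enum, the default level, and the functions demo_increase_verbosity and demo_parse_args are assumptions made for illustration; they do not reproduce the actual definitions in common/logging.c.

```c
#include <string.h>

/* Hypothetical log levels, ordered from most to least verbose. */
enum demo_log_level
{
    DEMO_LOG_DEBUG = 1,
    DEMO_LOG_INFO,
    DEMO_LOG_WARNING,
    DEMO_LOG_ERROR
};

/* Assumed default: warnings and errors only. */
static enum demo_log_level demo_level = DEMO_LOG_WARNING;

/* Step one level towards DEBUG; once at DEBUG, further calls do nothing. */
static void
demo_increase_verbosity(void)
{
    if (demo_level > DEMO_LOG_DEBUG)
        demo_level = (enum demo_log_level) (demo_level - 1);
}

/* A client implements --verbose by calling the helper once per occurrence. */
static void
demo_parse_args(int argc, char **argv)
{
    for (int i = 1; i < argc; i++)
    {
        if (strcmp(argv[i], "-v") == 0 || strcmp(argv[i], "--verbose") == 0)
            demo_increase_verbosity();
    }
}
```

Because the client never names a specific level, adding a new, more verbose level later would not require touching its option processing.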
2020-09-17  Allow CURRENT_ROLE where CURRENT_USER is accepted  (Peter Eisentraut)
In the particular case of GRANTED BY, this is specified in the SQL standard. Since in PostgreSQL, CURRENT_ROLE is equivalent to CURRENT_USER, and CURRENT_USER is already supported here, adding CURRENT_ROLE is trivial. The other cases are PostgreSQL extensions, but for the same reason it also makes sense there. Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Asif Rehman <asifr.rehman@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/f2feac44-b4c5-f38f-3699-2851d6a76dc9%402ndquadrant.com
2020-09-17  Add support for building GiST index by sorting.  (Heikki Linnakangas)
This adds a new optional support function to the GiST access method: sortsupport. If it is defined, the GiST index is built by sorting all data into the order defined by the sortsupport's comparator function, and packing the tuples in that order into GiST pages. This is similar to how B-tree index build works, and is much faster than inserting the tuples one by one. The resulting index is smaller too, because the pages are packed more tightly, up to 'fillfactor'. The normal build method works by splitting pages, which tends to lead to more wasted space. The quality of the resulting index depends on how good the opclass-defined sort order is. A good order preserves locality of the input data. As the first user of this facility, add 'sortsupport' function to the point_ops opclass. It sorts the points in Z-order (aka Morton Code), by interleaving the bits of the X and Y coordinates. Author: Andrey Borodin Reviewed-by: Pavel Borisov, Thomas Munro Discussion: https://www.postgresql.org/message-id/1A36620E-CAD8-4267-9067-FB31385E7C0D%40yandex-team.ru
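Bit interleaving for a Z-order key can be sketched as follows. This is a generic Morton encoding that assumes the coordinates are already unsigned 32-bit integers; how the point_ops sortsupport function maps floating-point coordinates to sortable integers is not described above and is not reproduced here. The names spread_bits32 and morton_encode are hypothetical.

```c
#include <stdint.h>

/*
 * Spread the 32 bits of v over the even bit positions of a 64-bit word,
 * leaving a zero bit between each pair of original bits.
 */
static uint64_t
spread_bits32(uint32_t v)
{
    uint64_t    x = v;

    x = (x | (x << 16)) & UINT64_C(0x0000FFFF0000FFFF);
    x = (x | (x << 8)) & UINT64_C(0x00FF00FF00FF00FF);
    x = (x | (x << 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
    x = (x | (x << 2)) & UINT64_C(0x3333333333333333);
    x = (x | (x << 1)) & UINT64_C(0x5555555555555555);
    return x;
}

/*
 * Interleave x and y: bits of x land on even positions, bits of y on odd
 * positions.  Sorting by the result visits points in Z-order, which tends
 * to keep points that are close in the plane close in the sort order.
 */
static uint64_t
morton_encode(uint32_t x, uint32_t y)
{
    return spread_bits32(x) | (spread_bits32(y) << 1);
}
```

Comparing two points then reduces to comparing their 64-bit keys, which is the kind of cheap comparator a sort-based build relies on.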
2020-09-16  Fix bogus completion tag usage in walsender  (Alvaro Herrera)
Since commit fd5942c18f97 (2012, 9.3-era), walsender has been sending completion tags for certain replication commands twice -- and they're not even consistent. Apparently neither libpq nor JDBC has a problem with it, but it's not kosher. Fix by removing the EndCommand() call in the common code path for them all, and inserting calls to EndReplicationCommand() specifically in those places where it's needed. EndReplicationCommand() is a new simple function to send the completion tag for replication commands. Do this instead of sending a generic SELECT completion tag for them all, which was also pretty bogus (if innocuous). While at it, change StartReplication() to use EndReplicationCommand() instead of pg_puttextmessage(). In commit 2f9661311b83, I failed to realize that replication commands are not close-enough kin of regular SQL commands, so the DROP_REPLICATION_SLOT tag I added is undeserved and a type pun. Take it out. Backpatch to 13, where the latter commit appeared. The duplicate tag has been sent since 9.3, but since nothing is broken, it doesn't seem worth fixing. Per complaints from Tom Lane. Discussion: https://postgr.es/m/1347966.1600195735@sss.pgh.pa.us
2020-09-16  Don't fetch partition check expression during InitResultRelInfo.  (Tom Lane)
Since there is only one place that actually needs the partition check expression, namely ExecPartitionCheck, it's better to fetch it from the relcache there. In this way we will never fetch it at all if the query never has use for it, and we still fetch it just once when we do need it. The reason for taking an interest in this is that if the relcache doesn't already have the check expression cached, fetching it requires obtaining AccessShareLock on the partition root. That means that operations that look like they should only touch the partition itself will also take a lock on the root. In particular we observed that TRUNCATE on a partition may take a lock on the partition's root, contributing to a deadlock situation in parallel pg_restore. As written, this patch does have a small cost, which is that we are microscopically reducing efficiency for the case where a partition has an empty check expression. ExecPartitionCheck will be called, and will go through the motions of setting up and checking an empty qual, where before it would not have been called at all. We could avoid that by adding a separate boolean flag to track whether there is a partition expression to test. However, this case only arises for a default partition with no siblings, which surely is not an interesting case in practice. Hence adding complexity for it does not seem like a good trade-off. Amit Langote, per a suggestion by me Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com
2020-09-16  Fix bogus cache-invalidation logic in logical replication worker.  (Tom Lane)
The code recorded cache invalidation events by zeroing the "localreloid" field of affected cache entries. However, it's possible for an inval event to occur even while we have the entry open and locked. So an ill-timed inval could result in "cache lookup failed for relation 0" errors, if the worker's code tried to use the cleared field. We can fix that by creating a separate bool field to record whether the entry needs to be revalidated. (In the back branches, cram the bool into what had been padding space, to avoid an ABI break in the somewhat unlikely event that any extension is looking at this struct.) Also, rearrange the logic in logicalrep_rel_open so that it does the right thing in cases where table_open would fail. We should retry the lookup by name in that case, but we didn't. The real-world impact of this is probably small. In the first place, the error conditions are very low probability, and in the second place, the worker would just exit and get restarted. We only noticed because in a CLOBBER_CACHE_ALWAYS build, the failure can occur repeatedly, preventing the worker from making progress. Nonetheless, it's clearly a bug, and it impedes a useful type of testing; so back-patch to v10 where this code was introduced. Discussion: https://postgr.es/m/1032727.1600096803@sss.pgh.pa.us
2020-09-12  Remove unused function declaration in logicalproto.h.  (Amit Kapila)
In passing, fix a typo in pgoutput.c. Reported-by: Tomas Vondra Author: Tomas Vondra Reviewed-by: Dilip Kumar Discussion: https://postgr.es/m/20200909084353.pncuclpbwlr7vylh@development
2020-09-11  logtape.c: do not preallocate for tapes when sorting  (Jeff Davis)
The preallocation logic is only useful for HashAgg, so disable it when sorting. Also, adjust an out-of-date comment. Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wzn_o7tE2+hRVvwSFghRb75AJ5g-nqGzDUqLYMexjOAe=g@mail.gmail.com Backpatch-through: 13
2020-09-11  Don't run atexit callbacks during signal exits from ProcessStartupPacket.  (Tom Lane)
Although 58c6feccf fixed the case for SIGQUIT, we were still calling proc_exit() from signal handlers for SIGTERM and timeout failures in ProcessStartupPacket. Fortunately, at the point where that code runs, we haven't yet connected to shared memory in any meaningful way, so there is nothing we need to undo in shared memory. This means it should be safe to use _exit(1) here, ie, not run any atexit handlers but also inform the postmaster that it's not a crash exit. To make sure nobody breaks the "nothing to undo" expectation, add a cross-check that no on-shmem-exit or before-shmem-exit handlers have been registered yet when we finish using these signal handlers. This change is simple enough that maybe it could be back-patched, but I won't risk that right now. Discussion: https://postgr.es/m/1850884.1599601164@sss.pgh.pa.us
2020-09-10  Print WAL logical message contents in pg_waldump  (Alvaro Herrera)
This helps debuggability when looking at WAL streams containing logical messages. Author: Ashutosh Bapat <ashutosh.bapat@2ndquadrant.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/CAExHW5sWx49rKmXbg5H1Xc1t+nRv9PaYKQmgw82HPt6vWDVmDg@mail.gmail.com
2020-09-09  Expose internal function for converting int64 to numeric  (Peter Eisentraut)
Existing callers had to take complicated detours via DirectFunctionCall1(). This simplifies a lot of code. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-09-08  Add support for partitioned tables and indexes in REINDEX  (Michael Paquier)
Until now, REINDEX was not able to work with partitioned tables and indexes, forcing users to reindex partitions one by one. This extends REINDEX INDEX and REINDEX TABLE so that they can accept a partitioned index and a partitioned table as input, respectively, to reindex all the partitions assigned to them with physical storage (foreign tables, partitioned tables and indexes are then discarded). This shares some logic with schema and database REINDEX as each partition gets processed in its own transaction after building a list of relations to work on. This choice has the advantage of limiting invalid indexes to a single partition with REINDEX CONCURRENTLY in the event of a cancellation or failure in flight, as the only indexes handled at once in a single REINDEX CONCURRENTLY loop are the ones from the partition being worked on. Isolation tests are added to emulate some cases I bumped into while developing this feature, particularly with the concurrent drop of a leaf partition being reindexed. However, this is rather limited as LOCK would cause REINDEX to block in the first transaction building the list of partitions. Per its multi-transaction nature, this new flavor cannot run in a transaction block, similarly to REINDEX SCHEMA, SYSTEM and DATABASE. Author: Justin Pryzby, Michael Paquier Reviewed-by: Anastasia Lubennikova Discussion: https://postgr.es/m/db12e897-73ff-467e-94cb-4af03705435f.adger.lj@alibaba-inc.com
2020-09-07  Add d_type to our Windows dirent emulation.  (Thomas Munro)
This allows us to skip some stat calls, by extending commit 861c6e7c to cover Windows systems. Author: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Magnus Hagander <magnus@hagander.net> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BFzxupGGN4GpUdbzZN%2Btn6FQPHo8w0Q%2BAPH5Wz8RG%2Bww%40mail.gmail.com
2020-09-07  Skip unnecessary stat() calls in walkdir().  (Thomas Munro)
Some kernels can tell us the type of a "dirent", so we can avoid a call to stat() or lstat() in many cases. Define a new function get_dirent_type() to contain that logic, for use by the backend and frontend versions of walkdir(), and perhaps other callers in future. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BFzxupGGN4GpUdbzZN%2Btn6FQPHo8w0Q%2BAPH5Wz8RG%2Bww%40mail.gmail.com
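The general technique of consulting the directory entry before falling back to a stat call can be sketched as below. This is not the get_dirent_type() added by the commit: DemoFileType and demo_dirent_type are hypothetical, the sketch assumes a POSIX-like system, and it only uses d_type where the platform headers define the DT_* constants.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1       /* ask glibc to expose d_type and DT_* */
#endif
#include <dirent.h>
#include <sys/stat.h>

/* Hypothetical result type for this illustration. */
typedef enum DemoFileType
{
    DEMO_FT_ERROR,
    DEMO_FT_UNKNOWN,
    DEMO_FT_REG,
    DEMO_FT_DIR,
    DEMO_FT_LNK
} DemoFileType;

/*
 * Classify a directory entry.  "path" is the full path of the entry and
 * "de" is what readdir() returned for it.  The kernel's answer in d_type
 * is used when available; lstat() is called only when it is not.
 */
static DemoFileType
demo_dirent_type(const char *path, const struct dirent *de)
{
    struct stat st;

#if defined(DT_REG) && defined(DT_DIR) && defined(DT_LNK)
    switch (de->d_type)
    {
        case DT_REG:
            return DEMO_FT_REG;
        case DT_DIR:
            return DEMO_FT_DIR;
        case DT_LNK:
            return DEMO_FT_LNK;
        default:
            break;              /* DT_UNKNOWN or another type: ask lstat() */
    }
#else
    (void) de;                  /* no d_type on this platform */
#endif

    if (lstat(path, &st) < 0)
        return DEMO_FT_ERROR;
    if (S_ISREG(st.st_mode))
        return DEMO_FT_REG;
    if (S_ISDIR(st.st_mode))
        return DEMO_FT_DIR;
    if (S_ISLNK(st.st_mode))
        return DEMO_FT_LNK;
    return DEMO_FT_UNKNOWN;
}
```

Some filesystems always report DT_UNKNOWN, so the fallback is needed for correctness and not just for portability.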
2020-09-06  Refactor pg_get_line() to expose an alternative StringInfo-based API.  (Tom Lane)
Letting the caller provide a StringInfo to read into is helpful when the caller needs to merge lines or otherwise modify the data after it's been read. Notably, now the code added by commit 8f8154a50 can use pg_get_line_append() instead of having its own copy of that logic. A follow-on commit will also make use of this. Also, since StringInfo buffers are a minimum of 1KB long, blindly using pg_get_line() in a loop can eat a lot more memory than one would expect. I discovered for instance that commit e0f05cd5b caused initdb to consume circa 10MB to read postgres.bki, even though that's under 1MB worth of data. A less memory-hungry alternative is to re-use the same StringInfo for all lines and pg_strdup the results. Discussion: https://postgr.es/m/1315832.1599345736@sss.pgh.pa.us
2020-09-05  Use multi-inserts for pg_depend  (Michael Paquier)
This is a follow-up of the work done in e3931d01. This case is a bit different than pg_attribute and pg_shdepend: the maximum number of items to insert is known in advance, but there is no need to handle pinned dependencies. Hence, the base allocation for slots is done based on the number of items and the maximum allowed with a cap at 64kB. Slots are initialized once used to minimize the overhead of the operation. The insertions can be done for dependencies of the same type. More could be done by grouping the insertion of multiple dependency types in a single batch. This is left as future work. Some of the multi-insert logic is also simplified for pg_shdepend, as per the feedback discussed for this specific patch. This also moves to indexing.h the variable capping the maximum amount of data that can be used at once for a multi-insert, instead of having separate definitions for pg_attribute, pg_depend and pg_shdepend. Author: Daniel Gustafsson, Michael Paquier Reviewed-by: Andres Freund, Álvaro Herrera Discussion: https://postgr.es/m/20200807061619.GA23955@paquier.xyz
2020-09-04  C comment: correct use of 64-"byte" cache line size  (Bruce Momjian)
Reported-by: Kelly Min Discussion: https://postgr.es/m/CAPSbxatOiQO90LYpSC3+svAU9-sHgDfEP4oFhcEUt_X=DqFA9g@mail.gmail.com Backpatch-through: 9.5
2020-09-04  Remove variable "concurrent" from ReindexStmt  (Michael Paquier)
This node already handles multiple options using a bitmask, so having a separate boolean flag is not necessary. This simplifies the code a bit with fewer arguments to pass to the reindex routines, by replacing the boolean with an equivalent bitmask value. Reviewed-by: Julien Rouhaud Discussion: https://postgr.es/m/20200902110326.GA14963@paquier.xyz
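Folding a boolean into an existing options bitmask follows a common C pattern, sketched here. The flag names, their values, and DemoReindexStmt are hypothetical and are not the REINDEXOPT_* definitions or the real ReindexStmt node.

```c
#include <stdbool.h>

/* Hypothetical option bits; each option owns exactly one distinct bit. */
#define DEMO_OPT_VERBOSE        (1 << 0)
#define DEMO_OPT_CONCURRENTLY   (1 << 1)

/* Hypothetical statement node: one bitmask, no separate boolean field. */
typedef struct DemoReindexStmt
{
    int     options;    /* OR of DEMO_OPT_* flags */
} DemoReindexStmt;

/* What used to be "stmt->concurrent" becomes a test of one bit. */
static bool
demo_is_concurrent(const DemoReindexStmt *stmt)
{
    return (stmt->options & DEMO_OPT_CONCURRENTLY) != 0;
}

/* Routines take the bitmask alone instead of a bitmask plus a boolean. */
static int
demo_count_enabled(int options)
{
    int     count = 0;

    if (options & DEMO_OPT_VERBOSE)
        count++;
    if (options & DEMO_OPT_CONCURRENTLY)
        count++;
    return count;
}
```

Keeping every flag on its own bit matters: two options sharing a bit would be indistinguishable once more flags are added.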
2020-09-03  Remove arbitrary restrictions on password length.  (Tom Lane)
This patch started out with the goal of harmonizing various arbitrary limits on password length, but after a while a better idea emerged: let's just get rid of those fixed limits. recv_password_packet() has an arbitrary limit on the packet size, which we don't really need, so just drop it. (Note that this doesn't really affect anything for MD5 or SCRAM password verification, since those will hash the user's password to something shorter anyway. It does matter for auth methods that require a cleartext password.) Likewise remove the arbitrary error condition in pg_saslprep(). The remaining limits are mostly in client-side code that prompts for passwords. To improve those, refactor simple_prompt() so that it allocates its own result buffer that can be made as big as necessary. Actually, it proves best to make a separate routine pg_get_line() that has essentially the semantics of fgets(), except that it allocates a suitable result buffer and hence will never return a truncated line. (pg_get_line has a lot of potential applications to replace randomly-sized fgets buffers elsewhere, but I'll leave that for another patch.) I built pg_get_line() atop stringinfo.c, which requires moving that code to src/common/; but that seems fine since it was a poor fit for src/port/ anyway. This patch is mostly mine, but it owes a good deal to Nathan Bossart who pressed for a solution to the password length problem and created a predecessor patch. Also thanks to Peter Eisentraut and Stephen Frost for ideas and discussion. Discussion: https://postgr.es/m/09512C4F-8CB9-4021-B455-EF4C4F0D55A0@amazon.com
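The "fgets() that never truncates" idea can be sketched in plain C with a growable heap buffer. This is an illustration only: demo_get_line is a hypothetical function built on malloc/realloc, whereas the commit describes pg_get_line() as built atop stringinfo.c. The sketch also assumes lines contain no embedded NUL bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Read one complete line, including its trailing newline if present,
 * into a freshly allocated buffer that the caller must free().
 * Returns NULL at end of file, on read error, or on out-of-memory.
 */
static char *
demo_get_line(FILE *stream)
{
    size_t  cap = 128;
    size_t  len = 0;
    char   *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int) (cap - len), stream) != NULL)
    {
        len += strlen(buf + len);

        /* A trailing newline means the whole line has been read. */
        if (len > 0 && buf[len - 1] == '\n')
            return buf;

        /* Buffer full without a newline: enlarge it and keep reading. */
        if (cap - len < 2)
        {
            char   *bigger = realloc(buf, cap * 2);

            if (bigger == NULL)
            {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }

    /* EOF or error: return a final unterminated line only if one was read. */
    if (len == 0 || ferror(stream))
    {
        free(buf);
        return NULL;
    }
    return buf;
}
```

A caller loops until NULL is returned and frees each line; unlike a fixed-size fgets() buffer, a very long line comes back in one piece.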
2020-09-03Add support for streaming to built-in logical replication.Amit Kapila
To add support for streaming of in-progress transactions into the built-in logical replication, we need to do three things:
* Extend the logical replication protocol, to identify in-progress transactions, and allow adding additional bits of information (e.g. XID of subtransactions).
* Modify the output plugin (pgoutput) to implement the new stream API callbacks, by leveraging the extended replication protocol.
* Modify the replication apply worker, to properly handle streamed in-progress transactions by spilling the data to disk and then replaying them on commit.
We however must explicitly disable streaming replication during replication slot creation, even if the plugin supports it. We don't need to replicate the changes accumulated during this phase, and moreover we don't have a replication connection open so we have nowhere to send the data anyway. Author: Tomas Vondra, Dilip Kumar and Amit Kapila Reviewed-by: Amit Kapila, Kuntal Ghosh and Ajin Cherian Tested-by: Neha Sharma, Mahendra Singh Thalor and Ajin Cherian Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
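The apply-side behaviour described here (spill streamed changes to disk, replay them on commit) can be sketched roughly as below. The class, method names, and file layout are hypothetical; the abort handling is an assumption added for completeness, and none of this mirrors the actual worker code.

```python
import os
import tempfile


class ApplyWorker:
    """Toy apply worker: buffers streamed changes per transaction on disk."""

    def __init__(self, spill_dir=None):
        # Hypothetical layout: one spill file per transaction ID.
        self.spill_dir = spill_dir or tempfile.mkdtemp(prefix="spill-")
        self.applied = []

    def _path(self, xid: int) -> str:
        return os.path.join(self.spill_dir, f"{xid}.changes")

    def stream_change(self, xid: int, change: str) -> None:
        # The transaction is still in progress upstream: do not apply yet,
        # just append the change to that transaction's spill file.
        with open(self._path(xid), "a", encoding="utf-8") as f:
            f.write(change + "\n")

    def stream_commit(self, xid: int) -> None:
        # On commit, replay the spilled changes in order, then drop the file.
        path = self._path(xid)
        with open(path, encoding="utf-8") as f:
            for line in f:
                self.applied.append(line.rstrip("\n"))
        os.remove(path)

    def stream_abort(self, xid: int) -> None:
        # Assumption: an aborted transaction's changes are simply discarded.
        path = self._path(xid)
        if os.path.exists(path):
            os.remove(path)
```

Changes from interleaved transactions stay separate until each one commits, so nothing becomes visible on the subscriber side before the outcome is known.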
2020-09-02Add string_to_table() function.Tom Lane
This splits a string at occurrences of a delimiter. It is exactly like string_to_array() except for producing a set of values instead of an array of values. Thus, the relationship of these two functions is the same as between regexp_split_to_table() and regexp_split_to_array(). Although the same results could be had from unnest(string_to_array()), this is somewhat faster than that, and anyway it seems reasonable to have it for symmetry with the regexp functions. Pavel Stehule, reviewed by Peter Smith Discussion: https://postgr.es/m/CAFj8pRD8HOpjq2TqeTBhSo_QkzjLOhXzGCpKJ4nCs7Y9SQkuPw@mail.gmail.com
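The array-versus-set relationship can be pictured with a small Python analogy: one function builds the whole result at once, the other produces the same values one at a time. This sketch covers only a plain, non-empty delimiter; the NULL-related behaviour and other edge cases of the real SQL functions are deliberately not modelled.

```python
from typing import Iterator, List


def string_to_array(text: str, delimiter: str) -> List[str]:
    """Analogy for the array-returning form: the full list is materialized."""
    return text.split(delimiter)


def string_to_table(text: str, delimiter: str) -> Iterator[str]:
    """Analogy for the set-returning form: fields are produced one by one."""
    if not delimiter:
        raise ValueError("this sketch only handles a non-empty delimiter")
    start = 0
    while True:
        end = text.find(delimiter, start)
        if end == -1:
            yield text[start:]  # last field
            return
        yield text[start:end]
        start = end + len(delimiter)
```

Both forms yield the same fields in the same order; the second simply avoids building an intermediate array before handing rows to the consumer.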
2020-09-02Fix thinko with definition of REINDEXOPT_MISSING_OKMichael Paquier
This had no direct consequences, but let's be consistent, since the previous definition would be confusing when adding new flags. Oversight in 1d65416. Reported-by: Justin Pryzby Discussion: https://postgr.es/m/20200902024148.GB20149@telsasoft.com
2020-09-02Improve handling of dropped relations for REINDEX DATABASE/SCHEMA/SYSTEMMichael Paquier
When multiple relations are reindexed, a scan of pg_class is done first to build the list of relations to work on. However, the REINDEX logic has never checked whether a listed relation still exists when beginning the work on it, causing for example sudden cache lookup failures. This commit adds safeguards against dropped relations for REINDEX, similarly to VACUUM or CLUSTER where we try to open the relation, ignoring it if it is missing. A new option is added to the REINDEX routines to control whether a missing relation is OK to ignore or not. An isolation test, based on REINDEX SCHEMA, is added for the concurrent and non-concurrent cases. Author: Michael Paquier Reviewed-by: Anastasia Lubennikova Discussion: https://postgr.es/m/20200813043805.GE11663@paquier.xyz
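The pattern of building a work list first and then tolerating entries that vanished in the meantime can be sketched like this. The catalog dictionary, the exception type, and the function name are all hypothetical stand-ins, not PostgreSQL interfaces.

```python
from typing import Dict, List


class RelationMissing(Exception):
    """Raised when a listed relation no longer exists and that is not OK."""


def reindex_many(catalog: Dict[int, str], targets: List[int],
                 missing_ok: bool) -> List[str]:
    """Process each listed relation, re-checking that it still exists."""
    done = []
    for oid in targets:
        # Try to "open" the relation only when its turn comes; it may have
        # been dropped since the list was built.
        name = catalog.get(oid)
        if name is None:
            if missing_ok:
                continue  # silently skip the dropped relation
            raise RelationMissing(oid)
        done.append(name)
    return done
```

With missing_ok set, a concurrent drop turns into a skipped entry instead of a failure partway through the list.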
2020-09-01Set cutoff xmin more aggressively when vacuuming a temporary table.Tom Lane
Since other sessions aren't allowed to look into a temporary table of our own session, we do not need to worry about the global xmin horizon when setting the vacuum XID cutoff. Indeed, if we're not inside a transaction block, we may set oldestXmin to be the next XID, because there cannot be any in-doubt tuples in a temp table, nor any tuples that are dead but still visible to some snapshot of our transaction. (VACUUM, of course, is never inside a transaction block; but we need to test that because CLUSTER shares the same code.) This approach allows us to always clean out a temp table completely during VACUUM, independently of concurrent activity. Aside from being useful in its own right, that simplifies building reproducible test cases. Discussion: https://postgr.es/m/3490536.1598629609@sss.pgh.pa.us
2020-08-31C comment: remove mention of use of t_hoff WAL structure memberBruce Momjian
Reported-by: Antonin Houska Discussion: https://postgr.es/m/21643.1595353537@antos Backpatch-through: 9.5
2020-08-30Mark factorial operator, and postfix operators in general, as deprecated.Tom Lane
Per discussion, we're planning to remove parser support for postfix operators in order to simplify the grammar. So it behooves us to put out a deprecation notice at least one release before that. There is only one built-in postfix operator, ! for factorial. Label it deprecated in the docs and in pg_description, and adjust some examples that formerly relied on it. (The sister prefix operator !! is also deprecated. We don't really have to remove that one, but since we're suggesting that people use factorial() instead, it seems better to remove both operators.) Also state in the CREATE OPERATOR ref page that postfix operators in general are going away. Although this changes the initial contents of pg_description, I did not force a catversion bump; it doesn't seem essential. In v13, also back-patch 4c5cf5431, so that there's someplace for the <link>s to point to. Mark Dilger and John Naylor, with some adjustments by me Discussion: https://postgr.es/m/BE2DF53D-251A-4E26-972F-930E523580E9@enterprisedb.com
2020-08-30Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE.Tom Lane
Historically, we've considered the state with relpages and reltuples both zero as indicating that we do not know the table's tuple density. This is problematic because it's impossible to distinguish "never yet vacuumed" from "vacuumed and seen to be empty". In particular, a user cannot use VACUUM or ANALYZE to override the planner's normal heuristic that an empty table should not be believed to be empty because it is probably about to get populated. That heuristic is a good safety measure, so I don't care to abandon it, but there should be a way to override it if the table is indeed intended to stay empty. Hence, represent the initial state of ignorance by setting reltuples to -1 (relpages is still set to zero), and apply the minimum-ten-pages heuristic only when reltuples is still -1. If the table is empty, VACUUM or ANALYZE (but not CREATE INDEX) will override that to reltuples = relpages = 0, and then we'll plan on that basis. This requires a bunch of fiddly little changes, but we can get rid of some ugly kluges that were formerly needed to maintain the old definition. One notable point is that FDWs' GetForeignRelSize methods will see baserel->tuples = -1 when no ANALYZE has been done on the foreign table. That seems like a net improvement, since those methods were formerly also in the dark about what baserel->tuples = 0 really meant. Still, it is an API change. I bumped catversion because code predating this change would get confused by seeing reltuples = -1. Discussion: https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com
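A minimal sketch of how the new convention separates the two empty-table cases is shown below. The function and parameter names are hypothetical, fallback_density stands in for whatever default guess a planner would make without statistics, and the real estimation code is considerably more involved.

```python
def estimate_tuples(relpages: int, reltuples: float, actual_pages: int,
                    fallback_density: float) -> float:
    """Hypothetical row estimate from stored statistics and current size."""
    pages = actual_pages
    # Minimum-ten-pages heuristic: applied only while reltuples is still -1,
    # i.e. the table has never been vacuumed or analyzed.
    if reltuples < 0 and pages < 10:
        pages = 10
    if pages == 0:
        return 0.0  # vacuumed or analyzed and seen to be empty: believe it
    if reltuples >= 0 and relpages > 0:
        density = reltuples / relpages  # tuples per page from the last stats
    else:
        density = fallback_density  # no usable statistics yet
    return density * pages
```

A never-analyzed empty table is still assumed to be about to grow, while an empty table that has been vacuumed or analyzed is planned as empty.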
2020-08-26Add additional information in the vacuum error context.Amit Kapila
The additional information added is the offset number for heap operations. This information will help us find the exact tuple that caused the error. Author: Mahendra Singh Thalor and Amit Kapila Reviewed-by: Sawada Masahiko, Justin Pryzby and Amit Kapila Discussion: https://postgr.es/m/CAKYtNApK488TDF4bMbw+1QH8HJf9cxdNDXquhU50TK5iv_FtCQ@mail.gmail.com
2020-08-26Extend the BufFile interface.Amit Kapila
Allow BufFile to support temporary files that can be used by a single backend when the corresponding files need to survive across transactions and need to be opened and closed multiple times. Such files need to be created as a member of a SharedFileSet. Additionally, this commit implements the interface for BufFileTruncate to allow files to be truncated up to a particular offset and extends the BufFileSeek API to support the SEEK_END case. This also adds an option to provide a mode while opening the shared BufFiles instead of always opening in read-only mode. These enhancements in the BufFile interface are required for the upcoming patch to allow the replication apply worker to handle streamed in-progress transactions. Author: Dilip Kumar, Amit Kapila Reviewed-by: Amit Kapila Tested-by: Neha Sharma Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2020-08-26Prevent non-superusers from reading pg_backend_memory_contexts, by default.Fujii Masao
The pg_backend_memory_contexts view contains some internal information about memory contexts. Since exposing it to any user by default may cause a security issue, this commit allows only superusers to read this view, by default, like we do for the pg_shmem_allocations view. Bump catalog version. Author: Atsushi Torikoshi Reviewed-by: Michael Paquier, Fujii Masao Discussion: https://postgr.es/m/1414992.1597849297@sss.pgh.pa.us
2020-08-21Fix handling of CREATE TABLE LIKE with inheritance.Tom Lane
If a CREATE TABLE command uses both LIKE and traditional inheritance, Vars in CHECK constraints and expression indexes that are absorbed from a LIKE parent table tended to get mis-numbered, resulting in wrong answers and/or bizarre error messages (though probably not any actual crashes, thanks to validation occurring in the executor). In v12 and up, the same could happen to Vars in GENERATED expressions, even in cases with no LIKE clause but multiple traditional-inheritance parents. The cause of the problem for LIKE is that parse_utilcmd.c supposed it could renumber such Vars correctly during transformCreateStmt(), which it cannot since we have not yet accounted for columns added via inheritance. Fix that by postponing processing of LIKE INCLUDING CONSTRAINTS, DEFAULTS, GENERATED, INDEXES till after we've performed DefineRelation(). The error with GENERATED and multiple inheritance is a simple oversight in MergeAttributes(); it knows it has to renumber Vars in inherited CHECK constraints, but forgot to apply the same processing to inherited GENERATED expressions (a/k/a defaults). Per bug #16272 from Tom Gottfried. The non-GENERATED variants of the issue are ancient, presumably dating right back to the addition of CREATE TABLE LIKE; hence back-patch to all supported branches. Discussion: https://postgr.es/m/16272-6e32da020e9a9381@postgresql.org
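To see why the renumbering has to wait, consider a toy model in which a constraint stores column positions rather than names. The sketch assumes that inherited columns end up ahead of the columns copied by LIKE; the function name and the list-of-positions representation are hypothetical simplifications.

```python
from typing import List


def renumber_vars(expr_attnos: List[int], like_columns: List[str],
                  final_columns: List[str]) -> List[int]:
    """Map 1-based column positions in the LIKE source table to positions
    in the finished table, matching columns by name."""
    position = {name: i + 1 for i, name in enumerate(final_columns)}
    return [position[like_columns[attno - 1]] for attno in expr_attnos]


# LIKE source table has columns (a, b) and a CHECK on b, i.e. position 2.
like_columns = ["a", "b"]
# The new table also inherits (x, y), which come first in the final layout.
final_columns = ["x", "y", "a", "b"]
check_on_b = renumber_vars([2], like_columns, final_columns)
```

The mapping is only computable once the final column list is known, which matches the fix of postponing this processing until after the relation has been defined.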
2020-08-21Fix a few typos in JIT comments and READMEDavid Rowley
Reviewed-by: Abhijit Menon-Sen Reviewed-by: Andres Freund Discussion: https://postgr.es/m/CAApHDvobgmCs6CohqhKTUf7D8vffoZXQTCBTERo9gbOeZmvLTw%40mail.gmail.com Backpatch-through: 11, where JIT was added
2020-08-19Suppress unnecessary RelabelType nodes in yet more cases.Tom Lane
Commit a477bfc1d fixed eval_const_expressions() to ensure that it didn't generate unnecessary RelabelType nodes, but I failed to notice that some other places in the planner had the same issue. Really noplace in the planner should be using plain makeRelabelType(), for fear of generating expressions that should be equal() to semantically equivalent trees, but aren't. An example is that because canonicalize_ec_expression() failed to be careful about this, we could end up with an equivalence class containing both a plain Const, and a Const-with-RelabelType representing exactly the same value. So far as I can tell this led to no visible misbehavior, but we did waste a bunch of cycles generating and evaluating "Const = Const-with-RelabelType" to prove such entries are redundant. Hence, move the support function added by a477bfc1d to where it can be more generally useful, and use it in the places where planner code previously used makeRelabelType. Back-patch to v12, like the previous patch. While I have no concrete evidence of any real misbehavior here, it's certainly possible that I overlooked a case where equivalent expressions that aren't equal() could cause a user-visible problem. In any case carrying extra RelabelType nodes through planning to execution isn't very desirable. Discussion: https://postgr.es/m/1311836.1597781384@sss.pgh.pa.us
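The general idea of avoiding redundant relabel nodes can be sketched with a toy expression tree. The node classes and apply_relabel() below are hypothetical illustrations of the approach, not the planner's actual data structures or support function.

```python
from dataclasses import dataclass, replace
from typing import Union


@dataclass(frozen=True)
class Const:
    value: object
    type_name: str


@dataclass(frozen=True)
class Var:
    name: str
    type_name: str


@dataclass(frozen=True)
class RelabelType:
    arg: "Expr"
    type_name: str


Expr = Union[Const, Var, RelabelType]


def apply_relabel(arg: Expr, type_name: str) -> Expr:
    """Relabel arg as type_name without stacking unnecessary wrapper nodes."""
    # Drop any relabels already on top; only the outermost label matters.
    while isinstance(arg, RelabelType):
        arg = arg.arg
    if isinstance(arg, Const):
        # A constant can carry the new type itself, so no wrapper is needed.
        return replace(arg, type_name=type_name)
    if arg.type_name == type_name:
        return arg  # already has the requested type
    return RelabelType(arg, type_name)
```

Because a relabeled constant comes out as a plain Const, two routes to the same value produce trees that compare equal, which is the property the commit message is after.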
2020-08-19Add pg_backend_memory_contexts system view.Fujii Masao
This view displays the usage of all the memory contexts of the server process attached to the current session. This information is useful to investigate the cause of backend-local memory bloat. This information can also be collected by calling MemoryContextStats(TopMemoryContext) via a debugger. But this technique cannot be used in some environments because no debugger is available there. It also outputs lots of text messages that are not easy to analyze. So, the pg_backend_memory_contexts view allows us to access backend-local memory context information more easily. Bump catalog version. Author: Atsushi Torikoshi, Fujii Masao Reviewed-by: Tatsuhito Kasahara, Andres Freund, Daniel Gustafsson, Robert Haas, Michael Paquier Discussion: https://postgr.es/m/72a656e0f71d0860161e0b3f67e4d771@oss.nttdata.com