summaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2018-06-27Cosmetic improvements for faster column addition.Amit Kapila
Changed the name of few structure members for the sake of clarity and removed spurious whitespace. Reported-by: Amit Kapila Author: Amit Kapila, based on suggestion by Andrew Dunstan Reviewed-by: Alvaro Herrera Discussion: https://postgr.es/m/CAA4eK1K2znsFpC+NQ9A4vxT4uDxADN4RmvHX0L6Y=aHVo9gB4Q@mail.gmail.com
2018-06-26Fix "base" snapshot handling in logical decodingAlvaro Herrera
Two closely related bugs are fixed. First, xmin of logical slots was advanced too early. During xl_running_xacts processing, xmin of the slot was set to the oldest running xid in the record, but that's wrong: actually, snapshots which will be used for not-yet-replayed transactions might consider older txns as running too, so we need to keep xmin back for them. The problem wasn't noticed earlier because DDL which allows to delete tuple (set xmax) while some another not-yet-committed transaction looks at it is pretty rare, if not unique: e.g. all forms of ALTER TABLE which change schema acquire ACCESS EXCLUSIVE lock conflicting with any inserts. The included test case (test_decoding's oldest_xmin) uses ALTER of a composite type, which doesn't have such interlocking. To deal with this, we must be able to quickly retrieve oldest xmin (oldest running xid among all assigned snapshots) from ReorderBuffer. To fix, add another list of ReorderBufferTXNs to the reorderbuffer, where transactions are sorted by base-snapshot-LSN. This is slightly different from the existing (sorted by first-LSN) list, because a transaction can have an earlier LSN but a later Xmin, if its first record does not obtain an xmin (eg. xl_xact_assignment). Note this new list doesn't fully replace the existing txn list: we still need that one to prevent WAL recycling. The second issue concerns SnapBuilder snapshots and subtransactions. SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a transaction that is known to be a subtxn, which is good in the common case that the top-level transaction already has one (no point in doing so), but a bug otherwise. To fix, arrange to transfer the snapshot from the subtxn to its top-level txn as soon as the kinship gets known. test_decoding's snapshot_transfer verifies this. Also, fix a minor memory leak: refcount of toplevel's old base snapshot was not decremented when the snapshot is transferred from child. Liberally sprinkle code comments, and rewrite a few existing ones. This part is my (Álvaro's) contribution to this commit, as I had to write all those comments in order to understand the existing code and Arseny's patch. Reported-by: Arseny Sher <a.sher@postgrespro.ru> Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru> Co-authored-by: Arseny Sher <a.sher@postgrespro.ru> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad
2018-06-26Fix upper limit for vacuum_cleanup_index_scale_factorAlexander Korotkov
6ca33a88 sets upper limit for vacuum_cleanup_index_scale_factor to DBL_MAX. DBL_MAX appears to be platform-dependent. That causes many buildfarm animals to fail, because we check boundaries of vacuum_cleanup_index_scale_factor in regression tests. This commit changes upper limit from DBL_MAX to just "large enough" limit, which was arbitrary selected as 1e10. Author: Alexander Korotkov Reported-by: Tom Lane, Darafei Praliaskouski Discussion: https://postgr.es/m/CAPpHfdvewmr4PcpRjrkstoNn1n2_6dL-iHRB21CCfZ0efZdBTg%40mail.gmail.com Discussion: https://postgr.es/m/CAC8Q8tLYFOpKNaPS_E7V8KtPdE%3D_TnAn16t%3DA3LuL%3DXjfOO-BQ%40mail.gmail.com
2018-06-26Correct a comment on logtape.c's leader tape.Peter Geoghegan
randomAccess parallel tuplesorts are disallowed because the leader would try to write to its own leader tape, not because the leader would try to write to a worker tape directly. Cleanup from commit 9da0cc35284.
2018-06-26Remove obsolete comment block in nbtsort.c.Peter Geoghegan
Building a new nbtree index through incremental insertions would always be slower than our actual approach of sorting using tuplesort, assembling leaf pages from tuplesort output, and writing and WAL-logging whole pages. Remove a comment block from the Berkeley days claiming that incremental insertions might be slightly faster with presorted input. Discussion: https://postgr.es/m/CAH2-WzmKs4mLAoFgJ3yHMRYc849efc=dw+pNRb3NEog2oJoCNw@mail.gmail.com
2018-06-26Enable failure to rename a partitioned indexAlvaro Herrera
Concurrently with partitioned index development (commit 8b08f7d4820f), the code to handle failure to rename indexes was refactored (commit 8b9e9644dc6a). Turns out that that particular case was untested, which naturally led it to be broken. Add tests and the missing code line. Co-authored-by: David Rowley <dgrowley@gmail.com> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com> Discussion: https://postgr.es/m/CAKcux6mfYMS3OX0ywjOiWiGSEKhJf-1zdeTceHFbd0mScUzU5A@mail.gmail.com
2018-06-26Allow direct lookups of AppendRelInfo by child relidAlvaro Herrera
find_appinfos_by_relids had quite a large overhead when the number of items in the append_rel_list was high, as it had to trawl through the append_rel_list looking for AppendRelInfos belonging to the given childrelids. Since there can only be a single AppendRelInfo for each child rel, it seems much better to store an array in PlannerInfo which indexes these by child relid, making the function O(1) rather than O(N). This function was only called once inside the planner, so just replace that call with a lookup to the new array. find_childrel_appendrelinfo is now unused and thus removed. This fixes a planner performance regression new to v11 reported by Thomas Reiss. Author: David Rowley Reported-by: Thomas Reiss Reviewed-by: Ashutosh Bapat Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/94dd7a4b-5e50-0712-911d-2278e055c622@dalibo.com
2018-06-26Increase upper limit for vacuum_cleanup_index_scale_factorAlexander Korotkov
Upper limits for vacuum_cleanup_index_scale_factor GUC and reloption were initially set to 100.0 in 857f9c36. However, after further discussion, it appears that some users like to disable B-tree cleanup index scan completely (assuming there are no deleted pages). vacuum_cleanup_index_scale_factor is used barely to protect against stalled index statistics. And after detailed consideration it appears that risk of stalled index statistics is low. And it would be nice to allow advanced users setting higher values of vacuum_cleanup_index_scale_factor. So, set upper limit for these GUC and reloption to DBL_MAX. Author: Alexander Korotkov Reviewed-by: Masahiko Sawada Discussion: https://postgr.es/m/CAC8Q8tJCb%3DgxhzcV7T6ctx7PY-Ux1oA-AsTJc6cAVNsQiYcCzA%40mail.gmail.com
2018-06-26Reword SPI_ERROR_TRANSACTION errors in PL/pgSQLPeter Eisentraut
The previous message for SPI_ERROR_TRANSACTION claimed "cannot begin/end transactions in PL/pgSQL", but that is no longer true. Nevertheless, the error can still happen, so reword the messages. The error cases in exec_prepare_plan() could never happen, so remove them.
2018-06-26Move RecoveryLockList into a hash table.Thomas Munro
Standbys frequently need to release all locks held by a given xid. Instead of searching one big list linearly, let's create one list per xid and put them in a hash table, so we can find what we need in O(1) time. Earlier analysis and a prototype were done by David Rowley, though this isn't his patch. Back-patch all the way. Author: Thomas Munro Diagnosed-by: David Rowley, Andres Freund Reviewed-by: Andres Freund, Tom Lane, Robert Haas Discussion: https://postgr.es/m/CAEepm%3D1mL0KiQ2KJ4yuPpLGX94a4Ns_W6TL4EGRouxWibu56pA%40mail.gmail.com Discussion: https://postgr.es/m/CAKJS1f9vJ841HY%3DwonnLVbfkTWGYWdPN72VMxnArcGCjF3SywA%40mail.gmail.com
2018-06-26Fix description and documentation related to pg_restore --no-commentsMichael Paquier
These descriptions have been referring to object dump, but a restore operation is done. Reported-by: Andrey Lizenko Author: Andrey Lizenko Discussion: https://postgr.es/m/152992021588.1268.16786093506650391435@wrigleys.postgresql.org
2018-06-26Correct handling of fsync failures with tar mode of walmethods.cMichael Paquier
This file has been missing the fact that it needs to report back to callers a proper failure on fsync calls. I have spotted the one in tar_finish() while Kuntal has spotted the one in tar_close(). Backpatch down to 10 where this code has been introduced. Reported by: Michael Paquier, Kuntal Ghosh Author: Michael Paquier Reviewed-by: Kuntal Ghosh, Magnus Hagander Discussion: https://postgr.es/m/20180625024356.GD1146@paquier.xyz
2018-06-25Update obsolete commentsAlvaro Herrera
Commit 9fab40ad32ef removed some pre-allocating logic in reorderbuffer.c, but left outdated comments in place. Repair. Author: Álvaro Herrera
2018-06-25Stamp 11beta2.REL_11_BETA2Alvaro Herrera
2018-06-25Translation updatesPeter Eisentraut
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 884f33d735870f94357820800840af3e93ff4628
2018-06-25Address set of issues with errno handlingMichael Paquier
System calls mixed up in error code paths are causing two issues which several code paths have not correctly handled: 1) For write() calls, sometimes the system may return less bytes than what has been written without errno being set. Some paths were careful enough to consider that case, and assumed that errno should be set to ENOSPC, other calls missed that. 2) errno generated by a system call is overwritten by other system calls which may succeed once an error code path is taken, causing what is reported to the user to be incorrect. This patch uses the brute-force approach of correcting all those code paths. Some refactoring could happen in the future, but this is let as future work, which is not targeted for back-branches anyway. Author: Michael Paquier Reviewed-by: Ashutosh Sharma Discussion: https://postgr.es/m/20180622061535.GD5215@paquier.xyz
2018-06-23Mark binary_upgrade_set_missing_value as parallel_unsafeAndrew Dunstan
per buildfarm. Bump catalog version again although in practice nobody is going to use this in a parallel query.
2018-06-22When index recurses to a partition, map columns numbersAlvaro Herrera
Two out of three code paths were mapping column numbers correctly if a partition had different column numbers than parent table, but the most commonly used one (recursing in CREATE INDEX to a new index on a partition) failed to map attribute numbers in expressions. Oddly enough, attnums in WHERE clauses are already handled correctly everywhere. Reported-by: Amit Langote Author: Amit Langote Discussion: https://postgr.es/m/dce1fda4-e0f0-94c9-6abb-f5956a98c057@lab.ntt.co.jp Reviewed-by: Álvaro Herrera
2018-06-22Avoid generating bogus paths with partitionwise aggregate.Robert Haas
Previously, if some or all partitions had no partially aggregated path, we would still try to generate a partially aggregated path for the parent, leading to assertion failures or wrong answers. Report by Rajkumar Raghuwanshi. Patch by Jeevan Chalke, reviewed by Ashutosh Bapat. A few changes by me. Discussion: http://postgr.es/m/CAKcux6=q4+Mw8gOOX16ef6ZMFp9Cve7KWFstUsrDa4GiFaXGUQ@mail.gmail.com
2018-06-22Allow for pg_upgrade of attributes with missing valuesAndrew Dunstan
Commit 16828d5c02 neglected to do this, so upgraded databases would silently get null instead of the specified default in rows without the attribute defined. A new binary upgrade function is provided to perform this and pg_dump is adjusted to output a call to the function if required in binary upgrade mode. Also included is code to drop missing attribute values for dropped columns. That way if the type is later dropped the missing value won't have a dangling reference to the type. Finally the regression tests are adjusted to ensure that there is a row with a missing value so that this code is exercised in upgrade testing. Catalog version unfortunately bumped. Regression test changes from Tom Lane. Remainder from me, reviewed by Tom Lane, Andres Freund, Alvaro Herrera Discussion: https://postgr.es/m/19987.1529420110@sss.pgh.pa.us
2018-06-22Fixes for vacuum_cleanup_index_scale_factor GUC optionAlexander Korotkov
vacuum_cleanup_index_scale_factor was located in autovacuum group of GUCs. However, it affects not only autovacuum, but also manually run VACUUM. It appears that "client connection defaults" group of GUCs is more appropriate for vacuum_cleanup_index_scale_factor, because vacuum_*_age options are already located there. Also, vacuum_cleanup_index_scale_factor was missed in postgresql.conf.sample. So, add it there with appropriate comment. Author: Masahiko Sawada with minor editorization by me Discussion: https://postgr.es/m/CAD21AoArsoXMLKudXSKN679FRzs6oubEchM53bHwn8Tp%3D2boNg%40mail.gmail.com
2018-06-22Fix typo in comment of commit_ts.c for incorrect reference to CLOGMichael Paquier
Author: Shao Bret
2018-06-22Improve coding pattern in Parallel Append code.Amit Kapila
The create_append_path code didn't consider that list_concat will modify it's first argument leading to inconsistent traversal of resulting list. In practice, it won't lead to any user-visible bug but changing it for making the code behave consistently. Reported-by: Tom Lane Author: Tom Lane Reviewed-by: Amit Khandekar and Amit Kapila Discussion: https://postgr.es/m/32365.1528994120@sss.pgh.pa.us
2018-06-21Fix partial aggregation for variance(int4) and related aggregates.Tom Lane
A typo in numeric_poly_combine caused bogus results for queries using it, but of course would only manifest if parallel aggregation is performed. Reported by Rajkumar Raghuwanshi. David Rowley did the diagnosis and the fix; I editorialized rather heavily on his regression test additions. Back-patch to v10 where the breakage was introduced (by 9cca11c91). Discussion: https://postgr.es/m/CAKcux6nU4E2x8nkSBpLOT2DPvQ5LviJ3SGyAN6Sz7qDH4G4+Pw@mail.gmail.com
2018-06-21Set correct context for XPath evaluationAlvaro Herrera
According to the SQL standard, the context of XMLTABLE's XPath row_expression is the document node of the XML input document, not the root node. This becomes visible when a relative path rather than absolute is used as row expression. Absolute paths is what was used in original tests and docs (and the most common form used in examples throughout the interwebs), which explains why this wasn't noticed before. Other functions such as xpath() and xpath_exists() also have this problem. While not specified by the SQL standard, it would be pretty odd to leave those functions to behave differently than XMLTABLE, so change them too. However, this is a backwards-incompatible change. No backpatch, out of fear of breaking code depending on the original broken behavior. Author: Markus Winand Reported-By: Markus Winand Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/0684A598-002C-42A2-AE12-F024A324EAE4@winand.at
2018-06-21Improve requirements documentation for ldap test suite.Tom Lane
Text by me; data contributed by me, Thomas Munro, Michael Paquier. Discussion: https://postgr.es/m/20180521013425.GA4476@paquier.xyz
2018-06-21Fix mishandling of sortgroupref labels while splitting SRF targetlists.Tom Lane
split_pathtarget_at_srfs() neglected to worry about sortgroupref labels in the intermediate PathTargets it constructs. I think we'd supposed that their labeling didn't matter, but it does at least for the case that GroupAggregate/GatherMerge nodes appear immediately under the ProjectSet step(s). This results in "ERROR: ORDER/GROUP BY expression not found in targetlist" during create_plan(), as reported by Rajkumar Raghuwanshi. To fix, make this logic track the sortgroupref labeling of expressions, not just their contents. This also restores the pre-v10 behavior that separate GROUP BY expressions will be kept distinct even if they are textually equal(). Discussion: https://postgr.es/m/CAKcux6=1_Ye9kx8YLBPmJs_xE72PPc6vNi5q2AOHowMaCWjJ2w@mail.gmail.com
2018-06-20Update expected XML output with disabled XMLAlvaro Herrera
Should have been done in previous commit.
2018-06-20Accept TEXT and CDATA nodes in XMLTABLE's column_expression.Alvaro Herrera
Column expressions that match TEXT or CDATA nodes must return the contents of the nodes themselves, not the content of non-existing children (i.e. the empty string). Author: Markus Winand Reported-by: Markus Winand Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/0684A598-002C-42A2-AE12-F024A324EAE4@winand.at
2018-06-20Add missing includeMagnus Hagander
Per buildfarm
2018-06-20Consistently use the term 'partitioned rel' in partprune commentsAlvaro Herrera
We were using 'partition rel' in a few places, which is quite confusing. Author: Amit Langote Reviewed-by: David Rowley Reviewed-by: Michaël Paquier Discussion: https://postgr.es/m/fd256561-31a2-4b7e-cd84-d8241e7ebc3f@lab.ntt.co.jp
2018-06-20Support long option for --pgdata in pg_verify_checksumsMagnus Hagander
Author: Daniel Gustafsson <daniel@yesql.se>
2018-06-20Don't consider parallel append for parallel unsafe paths.Amit Kapila
Commit ab72716778 allowed Parallel Append paths to be generated for a relation that is not parallel safe. Prevent that from happening. Initial analysis by Tom Lane. Reported-by: Rajkumar Raghuwanshi Author: Amit Kapila and Rajkumar Raghuwanshi Reviewed-by: Amit Khandekar and Robert Haas Discussion:https://postgr.es/m/CAKcux6=tPJ6nJ08r__nU_pmLQiC0xY15Fn0HvG1Cprsjdd9s_Q@mail.gmail.com
2018-06-20Clarify use of temporary tables within partition treesMichael Paquier
Since their introduction, partition trees have been a bit lossy regarding temporary relations. Inheritance trees respect the following patterns: 1) a child relation can be temporary if the parent is permanent. 2) a child relation can be temporary if the parent is temporary. 3) a child relation cannot be permanent if the parent is temporary. 4) The use of temporary relations also imply that when both parent and child need to be from the same sessions. Partitions share many similar patterns with inheritance, however the handling of the partition bounds make the situation a bit tricky for case 1) as the partition code bases a lot of its lookup code upon PartitionDesc which does not really look after relpersistence. This causes for example a temporary partition created by session A to be visible by another session B, preventing this session B to create an extra partition which overlaps with the temporary one created by A with a non-intuitive error message. There could be use-cases where mixing permanent partitioned tables with temporary partitions make sense, but that would be a new feature. Partitions respect 2), 3) and 4) already. It is a bit depressing to see those error checks happening in MergeAttributes() whose purpose is different, but that's left as future refactoring work. Back-patch down to 10, which is where partitioning has been introduced, except that default partitions do not apply there. Documentation also includes limitations related to the use of temporary tables with partition trees. Reported-by: David Rowley Author: Amit Langote, Michael Paquier Reviewed-by: Ashutosh Bapat, Amit Langote, Michael Paquier Discussion: https://postgr.es/m/CAKJS1f94Ojk0og9GMkRHGt8wHTW=ijq5KzJKuoBoqWLwSVwGmw@mail.gmail.com
2018-06-19Clarify the README files for the various separate TAP-based test suites.Tom Lane
Explain the difference between "make check" and "make installcheck". Mention the need for --enable-tap-tests (only some of these did so before). Standardize their wording about how to run the tests.
2018-06-19README: add URLs for openldap installationBruce Momjian
Reported-by: Michael Paquier Discussion: https://postgr.es/m/20180521013425.GA4476@paquier.xyz Backpatch-through: head
2018-06-19Track new configure flags introduced for version 11 in pg_config.h.win32Michael Paquier
The following set of flags mainly matter when building Postgres code with MSVC and those have been forgotten with latest developments: - HAVE_LDAP_INITIALIZE, added by 35c0754f, and marked as disabled. ldap_initialize() is a non-standard extension that provides a way to use "ldaps" with OpenLDAP, but it is not supported on Windows, and instead the non-standard ldap_sslinit() is used if WIN32 is defined. Per input from Thomas Munro. - HAVE_X509_GET_SIGNATURE_NID, added by 054e8c6c, which is used by SCRAM's channel binding tls-server-end-point. Having this flag disabled would cause this channel binding type to be unsupported for Windows builds. - HAVE_SSL_CLEAR_OPTIONS, added recently as of a364dfa4 to disable SSL compression. - HAVE_ASN1_STRING_GET0_DATA, added by 5c6df67, which is used to track a new compatibility with OpenSSL 1.1.0. This was missing from pg_config.win32.h and is not enabled by default. HAVE_BIO_GET_DATA, HAVE_OPENSSL_INIT_SSL and HAVE_BIO_METH_NEW gain the same treatment. The second and third flags are enabled with this commit, which raises the bar of OpenSSL support to 1.0.2 on Windows as a minimum. As this is the LTS (long-time support) version of OpenSSL community and knowing that all recent installers referred by OpenSSL upstream don't have anymore 1.0.1 or older, we could live with that requirement. In order to allow the code to compile with OpenSSL 1.1.0, all the flags mentioned above need to be enabled in pg_config.h.win32. Author: Michael Paquier Reviewed-by: Andrew Dunstan Discussion: https://postgr.es/m/20180529211559.GF6632@paquier.xyz
2018-06-18Allow plperl_sv_to_datum to look through scalar refs.Tom Lane
There seems little reason for the policy of throwing error if we find a ref to something other than a hash or array. Recursively look through the ref, instead. This makes the behavior in non-transform cases comparable to what was already instantiated for jsonb_plperl. Note that because we invoke any available transform function before considering the ref case, it's up to each transform function whether it wants to play along with this behavior or do something different. Because the previous behavior was just to throw a useless error, this seems unlikely to create any compatibility issues. Still, given the lack of field complaints so far, seems best not to back-patch. Discussion: https://postgr.es/m/28336.1528393969@sss.pgh.pa.us
2018-06-18Remove obsolete prohibition on function name matching a column name.Tom Lane
ProcedureCreate formerly threw an error if the function to be created has one argument of composite type and the function name matches some column of the composite type. This was a (very non-bulletproof) defense against creating situations where f(x) and x.f are ambiguous. But we don't really need such a defense in the wake of commit b97a3465d, which allows us to deal with such situations fairly cleanly. This behavior also created a dump-and-reload hazard, since a function might be rejected if a conflicting column name had been added to the input composite type later. Hence, let's just drop the check. Discussion: https://postgr.es/m/CAOW5sYa3Wp7KozCuzjOdw6PiOYPi6D=VvRybtH2S=2C0SVmRmA@mail.gmail.com
2018-06-18Consider syntactic form when disambiguating function vs column reference.Tom Lane
Postgres has traditionally considered the syntactic forms f(x) and x.f to be equivalent, allowing tricks such as writing a function and then using it as though it were a computed-on-demand column. However, our behavior when both interpretations are feasible left something to be desired: we always chose the column interpretation. This could lead to very surprising results, as in a recent bug report from Neil Conway. It also created a dump-and-reload hazard, since what was a function call in a dumped view could get interpreted as a column reference at reload, if a matching column name had been added to the underlying table since the view was created. What seems better, in ambiguous situations, is to prefer the choice matching the syntactic form of the reference. This seems much less astonishing in general, and it fixes the dump/reload hazard. Although this could be called a bug fix, there have been few complaints and there's some small risk of breaking applications that depend on the old behavior, so no back-patch. It does seem reasonable to slip it into v11, though. Discussion: https://postgr.es/m/CAOW5sYa3Wp7KozCuzjOdw6PiOYPi6D=VvRybtH2S=2C0SVmRmA@mail.gmail.com
2018-06-18Add PGTYPESchar_free() to avoid cross-module problems on Windows.Thomas Munro
On Windows, it is sometimes important for corresponding malloc() and free() calls to be made from the same DLL, since some build options can result in multiple allocators being active at the same time. For that reason we already provided PQfreemem(). This commit adds a similar function for freeing string results allocated by the pgtypes library. Author: Takayuki Tsunakawa Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8AD5D6%40G01JPEXMBYT05
2018-06-18Prevent hard failures of standbys caused by recycled WAL segmentsMichael Paquier
When a standby's WAL receiver stops reading WAL from a WAL stream, it writes data to the current WAL segment without having priorily zero'ed the page currently written to, which can cause the WAL reader to read junk data from a past recycled segment and then it would try to get a record from it. While sanity checks in place provide most of the protection needed, in some rare circumstances, with chances increasing when a record header crosses a page boundary, then the startup process could fail violently on an allocation failure, as follows: FATAL: invalid memory alloc request size XXX This is confusing for the user and also unhelpful as this requires in the worst case a manual restart of the instance, impacting potentially the availability of the cluster, and this also makes WAL data look like it is in a corrupted state. The chances of seeing failures are higher if the connection between the standby and its root node is unstable, causing WAL pages to be written in the middle. A couple of approaches have been discussed, like zero-ing new WAL pages within the WAL receiver itself but this has the disadvantage of impacting performance of any existing instances as this breaks the sequential writes done by the WAL receiver. This commit deals with the problem with a more simple approach, which has no performance impact without reducing the detection of the problem: if a record is found with a length higher than 1GB for backends, then do not try any allocation and report a soft failure which will force the standby to retry reading WAL. It could be possible that the allocation call passes and that an unnecessary amount of memory is allocated, however follow-up checks on records would just fail, making this allocation short-lived anyway. This patch owes a great deal to Tsunakawa Takayuki for reporting the failure first, and then discussing a couple of potential approaches to the problem. Backpatch down to 9.5, which is where palloc_extended has been introduced. Reported-by: Tsunakawa Takayuki Reviewed-by: Tsunakawa Takayuki Author: Michael Paquier Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8B57AD@G01JPEXMBYT05
2018-06-17Suppress -Wshift-negative-value warnings.Tom Lane
Clean up four places that result in compiler warnings when using recent gcc with this warning class enabled (as seen on buildfarm members calliphoridae, skink, and others). In all these places, this is purely cosmetic, because the shift distance could not be large enough to risk a change of sign, so there's no chance of implementation-dependent behavior. Still, it's easy enough to avoid the warning by casting the shifted value to unsigned, so let's do that. Patch HEAD only, this isn't worth a back-patch.
2018-06-16Avoid unnecessary use of strncpy in a couple of places in ecpg.Tom Lane
Use of strncpy with a length limit based on the source, rather than the destination, is non-idiomatic and draws warnings from gcc 8. Replace with memcpy, which does exactly the same thing in these cases, but with less chance for confusion. Backpatch to all supported branches. Discussion: https://postgr.es/m/21789.1529170195@sss.pgh.pa.us
2018-06-16Use snprintf not sprintf in pg_waldump's timestamptz_to_str.Tom Lane
This could only cause an issue if strftime returned a ridiculously long timezone name, which seems unlikely; and it wouldn't qualify as a security problem even then, since pg_waldump (nee pg_xlogdump) is a debug tool not part of the server. But gcc 8 has started issuing warnings about it, so let's use snprintf and be safe. Backpatch to 9.3 where this code was added. Discussion: https://postgr.es/m/21789.1529170195@sss.pgh.pa.us
2018-06-16Fix some minor error-checking oversights in ParseFuncOrColumn().Tom Lane
Recent additions to ParseFuncOrColumn to make it reject non-procedure functions in CALL were neither adequate nor documented. Reorganize the code to ensure uniform results for all the cases that should be rejected. Also, use ERRCODE_WRONG_OBJECT_TYPE for this case as well as the converse case of a procedure in a non-CALL context. The original coding used ERRCODE_UNDEFINED_FUNCTION which seems wrong, and is certainly inconsistent with the adjacent wrong-kind-of-routine errors. This reorganization also causes the checks for aggregate decoration with a non-aggregate function to be made in the FUNCDETAIL_COERCION case; that they were not is a long-standing oversight. Discussion: https://postgr.es/m/14497.1529089235@sss.pgh.pa.us
2018-06-16Remove AELs from subxids correctly on standbySimon Riggs
Issues relate only to subtransactions that hold AccessExclusiveLocks when replayed on standby. Prior to PG10, aborting subtransactions that held an AccessExclusiveLock failed to release the lock until top level commit or abort. 49bff5300d527 fixed that. However, 49bff5300d527 also introduced a similar bug where subtransaction commit would fail to release an AccessExclusiveLock, leaving the lock to be removed sometimes early and sometimes late. This commit fixes that bug also. Backpatch to PG10 needed. Tested by observation. Note need for multi-node isolationtester to improve test coverage for this and other HS cases. Reported-by: Simon Riggs Author: Simon Riggs
2018-06-16Fix memory leak in BufFileCreateShared().Tatsuo Ishii
Also this commit unifies some duplicated code in makeBufFile() and BufFileOpenShared() into new function makeBufFileCommon(). Author: Antonin Houska Reviewed-By: Thomas Munro, Tatsuo Ishii Discussion: https://postgr.es/m/16139.1529049566%40localhost
2018-06-15Fix off-by-one bug in XactLogCommitRecordAlvaro Herrera
Commit 1eb6d6527aae introduced zeroed alignment bytes in the GID field of commit/abort WAL records. Fixup commit cf5a1890592b later changed that representation into a regular cstring with a single terminating zero byte, but it also introduced an off-by-one mistake. Fix that. Author: Nikhil Sontakke Reported-by: Nikhil Sontakke Discussion: https://postgr.es/m/CAMGcDxey6dG1DP34_tJMoWPcp5sPJUAL4K5CayUUXLQSx2GQpA@mail.gmail.com
2018-06-15Fix memory leak.Tatsuo Ishii
Memory is allocated twice for "file" and "files" variables in BufFileOpenShared(). Author: Antonin Houska Discussion: https://postgr.es/m/11329.1529045692%40localhost