summaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2017-09-08Clean up excessive codePeter Eisentraut
The encoding ID was converted between string and number too many times, probably a remnant from the shell script days. Reviewed-by: Aleksandr Parfenov <a.parfenov@postgrespro.ru>
2017-09-08Remove useless empty string initializationsPeter Eisentraut
This coding style probably stems from the days of shell scripts. Reviewed-by: Aleksandr Parfenov <a.parfenov@postgrespro.ru>
2017-09-08Remove useless dead codePeter Eisentraut
Reviewed-by: Aleksandr Parfenov <a.parfenov@postgrespro.ru>
2017-09-08Fix assorted portability issues in new pgbench TAP tests.Tom Lane
* Our own version of getopt_long doesn't support abbreviation of long options. * It doesn't do automatic rearrangement of non-option arguments to the end, either. * Test was way too optimistic about the platform independence of NaN and Infinity outputs. I rather imagine we might have to lose those tests altogether, but for the moment just allow case variation and fully spelled out Infinity. Per buildfarm.
2017-09-08Add much-more-extensive TAP tests for pgbench.Tom Lane
Fabien Coelho, reviewed by Nikolay Shaplov and myself Discussion: https://postgr.es/m/alpine.DEB.2.20.1704171422500.4025@lancre
2017-09-07Refactor get_partition_for_tuple a bit.Robert Haas
Pending patches for both default partitioning and hash partitioning find the current coding pattern to be inconvenient. Change it so that we switch on the partitioning method first and then do whatever is needed. Amul Sul, reviewed by Jeevan Ladhe, with a few adjustments by me. Discussion: http://postgr.es/m/CAAJ_b97mTb=dG2pv6+1ougxEVZFVnZJajW+0QHj46mEE7WsoOQ@mail.gmail.com Discussion: http://postgr.es/m/CAOgcT0M37CAztEinpvjJc18EdHfm23fw0EG9-36Ya=+rEFUqaQ@mail.gmail.com
2017-09-07Improve performance of get_actual_variable_range with recently-dead tuples.Tom Lane
In commit fccebe421, we hacked get_actual_variable_range() to scan the index with SnapshotDirty, so that if there are many uncommitted tuples at the end of the index range, it wouldn't laboriously scan through all of them looking for a live value to return. However, that didn't fix it for the case of many recently-dead tuples at the end of the index; SnapshotDirty recognizes those as committed dead and so we're back to the same problem. To improve the situation, invent a "SnapshotNonVacuumable" snapshot type and use that instead. The reason this helps is that, if the snapshot rejects a given index entry, we know that the indexscan will mark that index entry as killed. This means the next get_actual_variable_range() scan will proceed past that entry without visiting the heap, making the scan a lot faster. We may end up accepting a recently-dead tuple as being the estimated extremal value, but that doesn't seem much worse than the compromise we made before to accept not-yet-committed extremal values. The cost of the scan is still proportional to the number of dead index entries at the end of the range, so in the interval after a mass delete but before VACUUM's cleaned up the mess, it's still possible for get_actual_variable_range() to take a noticeable amount of time, if you've got enough such dead entries. But the constant factor is much much better than before, since all we need to do with each index entry is test its "killed" bit. We chose to back-patch commit fccebe421 at the time, but I'm hesitant to do so here, because this form of the problem seems to affect many fewer people. Also, even when it happens, it's less bad than the case fixed by commit fccebe421 because we don't get the contention effects from expensive TransactionIdIsInProgress tests. Dmitriy Sarafannikov, reviewed by Andrey Borodin Discussion: https://postgr.es/m/05C72CF7-B5F6-4DB9-8A09-5AC897653113@yandex.ru
2017-09-07Reduce excessive dereferencing of function pointersPeter Eisentraut
It is equivalent in ANSI C to write (*funcptr) () and funcptr(). These two styles have been applied inconsistently. After discussion, we'll use the more verbose style for plain function pointer variables, to make it clear that it's a variable, and the shorter style when the function pointer is in a struct (s.func() or s->func()), because then it's clear that it's not a plain function name, and otherwise the excessive punctuation makes some of those invocations hard to read. Discussion: https://www.postgresql.org/message-id/f52c16db-14ed-757d-4b48-7ef360b1631d@2ndquadrant.com
2017-09-07Even if some partitions are foreign, allow tuple routing.Robert Haas
This doesn't allow routing tuple to the foreign partitions themselves, but it permits tuples to be routed to regular partitions despite the presence of foreign partitions in the same inheritance hierarchy. Etsuro Fujita, reviewed by Amit Langote and by me. Discussion: http://postgr.es/m/bc3db4c1-1693-3b8a-559f-33ad2b50b7ad@lab.ntt.co.jp
2017-09-07Fix handling of savepoint commands within multi-statement Query strings.Tom Lane
Issuing a savepoint-related command in a Query message that contains multiple SQL statements led to a FATAL exit with a complaint about "unexpected state STARTED". This is a shortcoming of commit 4f896dac1, which attempted to prevent such misbehaviors in multi-statement strings; its quick hack of marking the individual statements as "not top-level" does the wrong thing in this case, and isn't a very accurate description of the situation anyway. To fix, let's introduce into xact.c an explicit model of what happens for multi-statement Query strings. This is an "implicit transaction block in progress" state, which for many purposes works like the normal TBLOCK_INPROGRESS state --- in particular, IsTransactionBlock returns true, causing the desired result that PreventTransactionChain will throw error. But in case of error abort it works like TBLOCK_STARTED, allowing the transaction to be cancelled without need for an explicit ROLLBACK command. Commit 4f896dac1 is reverted in toto, so that we go back to treating the individual statements as "top level". We could have left it as-is, but this allows sharpening the error message for PreventTransactionChain calls inside functions. Except for getting a normal error instead of a FATAL exit for savepoint commands, this patch should result in no user-visible behavioral change (other than that one error message rewording). There are some things we might want to do in the line of changing the appearance or wording of error and warning messages around this behavior, which would be much simpler to do now that it's an explicitly modeled state. But I haven't done them here. Although this fixes a long-standing bug, no backpatch. The consequences of the bug don't seem severe enough to justify the risk that this commit itself creates some new issue. Patch by me, but it owes something to previous investigation by Takayuki Tsunakawa, who also reported the bug in the first place. Also thanks to Michael Paquier for reviewing. Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F6BE40D@G01JPEXMBYT05
2017-09-07Further marginal hacking on generic atomic ops.Tom Lane
In the generic atomic ops that rely on a loop around a CAS primitive, there's no need to force the initial read of the "old" value to be atomic. In the typically-rare case that we get a torn value, that simply means that the first CAS attempt will fail; but it will update "old" to the atomically-read value, so the next attempt has a chance of succeeding. It was already being done that way in pg_atomic_exchange_u64_impl(), but let's duplicate the approach in the rest. (Given the current coding of the pg_atomic_read functions, this change is a no-op anyway on popular platforms; it only makes a difference where pg_atomic_read_u64_impl() is implemented as a CAS.) In passing, also remove unnecessary take-a-pointer-and-dereference-it coding in the pg_atomic_read functions. That seems to have been based on a misunderstanding of what the C standard requires. What actually matters is that the pointer be declared as pointing to volatile, which it is. I don't believe this will change the assembly code at all on x86 platforms (even ignoring the likelihood that these implementations get overridden by others); but it may help on less-mainstream CPUs. Discussion: https://postgr.es/m/13707.1504718238@sss.pgh.pa.us
2017-09-07Exclude special values in recovery_target_timeSimon Riggs
recovery_target_time accepts timestamp input, though does not allow use of special values, e.g. “today”. Report a useful error message for these cases. Reported-by: Piotr Stefaniak Author: Simon Riggs Discussion: https://postgr.es/m/CANP8+jJdKA+BkkYLWz9zAm16Y0s2ExBv0WfpAwXdTpPfWnA9Bg@mail.gmail.com
2017-09-06Merge duplicative code for \sf/\sv, \ef/\ev in psql/command.c.Tom Lane
Saves ~150 lines, costs little. Fabien Coelho, reviewed by Victor Drobny Discussion: https://postgr.es/m/alpine.DEB.2.20.1703311958001.14355@lancre
2017-09-06Allow SET STATISTICS on expression indexesSimon Riggs
Index columns are referenced by ordinal number rather than name, e.g. CREATE INDEX coord_idx ON measured (x, y, (z + t)); ALTER INDEX coord_idx ALTER COLUMN 3 SET STATISTICS 1000; Incompatibility note for release notes: \d+ for indexes now also displays Stats Target Authors: Alexander Korotkov, with contribution by Adrien NAYRAT Review: Adrien NAYRAT, Simon Riggs Wordsmith: Simon Riggs
2017-09-06Use more of gcc's __sync_fetch_and_xxx builtin functions for atomic ops.Tom Lane
In addition to __sync_fetch_and_add, gcc offers __sync_fetch_and_sub, __sync_fetch_and_and, and __sync_fetch_and_or, which correspond directly to primitive atomic ops that we want. Testing shows that in some cases they generate better code than our generic implementations, so use them. We've assumed that configure's test for __sync_val_compare_and_swap is sufficient to allow assuming that __sync_fetch_and_add is available, so make the same assumption for these functions. Should that prove to be wrong, we can add more configure tests. Yura Sokolov, reviewed by Jesper Pedersen and myself Discussion: https://postgr.es/m/7f65886daca545067f82bf2b463b218d@postgrespro.ru
2017-09-06Remove duplicate reads from the inner loops in generic atomic ops.Tom Lane
The pg_atomic_compare_exchange_xxx functions are defined to update *expected to whatever they read from the target variable. Therefore, there's no need to do additional explicit reads after we've initialized the "old" variable. The actual benefit of this is somewhat debatable, but it seems fairly unlikely to hurt anything, especially since we will override the generic implementations in most performance-sensitive cases. Yura Sokolov, reviewed by Jesper Pedersen and myself Discussion: https://postgr.es/m/7f65886daca545067f82bf2b463b218d@postgrespro.ru
2017-09-06Clean up handling of dropped columns in NAMEDTUPLESTORE RTEs.Tom Lane
The NAMEDTUPLESTORE patch piggybacked on the infrastructure for TABLEFUNC/VALUES/CTE RTEs, none of which can ever have dropped columns, so the possibility was ignored most places. Fix that, including adding a specification to parsenodes.h about what it's supposed to look like. In passing, clean up assorted comments that hadn't been maintained properly by said patch. Per bug #14799 from Philippe Beaudoin. Back-patch to v10. Discussion: https://postgr.es/m/20170906120005.25630.84360@wrigleys.postgresql.org
2017-09-05Add \gdesc psql command.Tom Lane
This command acts somewhat like \g, but instead of executing the query buffer, it merely prints a description of the columns that the query result would have. (Of course, this still requires parsing the query; if parse analysis fails, you get an error anyway.) We accomplish this using an unnamed prepared statement, which should be invisible to psql users. Pavel Stehule, reviewed by Fabien Coelho Discussion: https://postgr.es/m/CAFj8pRBhYVvO34FU=EKb=nAF5t3b++krKt1FneCmR0kuF5m-QA@mail.gmail.com
2017-09-05Use lfirst_node() and linitial_node() where appropriate in planner.c.Tom Lane
There's no particular reason to target this module for the first wholesale application of these macros; but we gotta start somewhere. Ashutosh Bapat and Jeevan Chalke Discussion: https://postgr.es/m/CAFjFpRcNr3r=u0ni=7A4GD9NnHQVq+dkFafzqo2rS6zy=dt1eg@mail.gmail.com
2017-09-05Remove endof macroPeter Eisentraut
It has not been used in a long time, and it doesn't seem safe anyway, so drop it. Reviewed-by: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: Ryan Murphy <ryanfmurphy@gmail.com>
2017-09-05Remove unnecessary parentheses in return statementsPeter Eisentraut
The parenthesized style has only been used in a few modules. Change that to use the style that is predominant across the whole tree. Reviewed-by: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: Ryan Murphy <ryanfmurphy@gmail.com>
2017-09-05Remove our own definition of NULLPeter Eisentraut
Surely everyone has that by now. Reviewed-by: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: Ryan Murphy <ryanfmurphy@gmail.com>
2017-09-05Support retaining data dirs on successful TAP testsPeter Eisentraut
This moves the data directories from using temporary directories with randomness in the directory name to a static name, to make it easier to debug. The data directory will be retained if tests fail or the test code dies/exits with failure, and is automatically removed on the next make check. If the environment variable PG_TEST_NOCLEAN is defined, the data directories will be retained regardless of test or exit status. Author: Daniel Gustafsson <daniel@yesql.se>
2017-09-05In psql, use PSQL_PAGER in preference to PAGER, if it's set.Tom Lane
This allows the user's environment to set up a psql-specific choice of pager, in much the same way that we provide PSQL_EDITOR to allow a psql-specific override of the more widely known EDITOR variable. Pavel Stehule, reviewed by Thomas Munro Discussion: https://postgr.es/m/CAFj8pRD3RRk9S1eRbnGm_T6brc3Ss5mohraNzTSJquzx+pmtKA@mail.gmail.com
2017-09-05Correct base backup throttlingAlvaro Herrera
Throttling for sending a base backup in walsender is broken for the case where there is a lot of WAL traffic, because the latch used to put the walsender to sleep is also signalled by regular WAL traffic (and each signal causes an additional batch of data to be sent); the net effect is that there is no or little actual throttling. This is undesirable, so rewrite the sleep into a loop to achieve the desired effeect. Author: Jeff Janes, small tweaks by me Reviewed-by: Antonin Houska Discussion: https://postgr.es/m/CAMkU=1xH6mde-yL-Eo1TKBGNd0PB1-TMxvrNvqcAkN-qr2E9mw@mail.gmail.com
2017-09-05Add psql variables showing server version and psql version.Tom Lane
We already had a psql variable VERSION that shows the verbose form of psql's own version. Add VERSION_NAME to show the short form (e.g., "11devel") and VERSION_NUM to show the numeric form (e.g., 110000). Also add SERVER_VERSION_NAME and SERVER_VERSION_NUM to show the short and numeric forms of the server's version. (We'd probably add SERVER_VERSION with the verbose string if it were readily available; but adding another network round trip to get it seems too expensive.) The numeric forms, in particular, are expected to be useful for scripting purposes, now that psql can do conditional tests. Fabien Coelho, reviewed by Pavel Stehule Discussion: https://postgr.es/m/alpine.DEB.2.20.1704020917220.4632@lancre
2017-09-05Reformat psql's --help=variables output.Tom Lane
The previous format with variable names and descriptions in separate columns was extremely constraining about the length of the descriptions. We'd dealt with that in several inconsistent ways over the years, including letting the lines run over 80 characters, breaking descriptions into multiple lines, or shoving the description onto a separate line. But it's been a long time since the output could realistically fit onto a single screen vertically, so let's just rely even more heavily on the pager to deal with the vertical distance, and split each entry into two (or more) lines, in the format variable-name variable description goes here Each variable name + description remains a single translatable string, in hopes of reducing translator confusion; we're just changing the embedded whitespace. I failed to resist the temptation to copy-edit one or two of the descriptions while at it. Discussion: https://postgr.es/m/2947.1504542679@sss.pgh.pa.us
2017-09-04Be more careful about newline-chomping in pgbench.Tom Lane
process_backslash_command would drop the last character of the input command on the assumption that it was a newline. Given a non newline terminated input file, this could result in dropping the last character of the command. Fix that by doing an actual test that we're removing a newline. While at it, allow for Windows newlines (\r\n), and suppress multiple newlines if any. I do not think either of those cases really occur, since (a) we read script files in text mode and (b) the lexer stops when it hits a newline. But it's cheap enough and it provides a stronger guarantee about what the result string looks like. This is just cosmetic, I think, since the possibly-overly-chomped line was only used for display not for further processing. So it doesn't seem necessary to back-patch. Fabien Coelho, reviewed by Nikolay Shaplov, whacked around a bit by me Discussion: https://postgr.es/m/alpine.DEB.2.20.1704171422500.4025@lancre
2017-09-04Fix some subtle problems in pgbench transaction stats counting.Tom Lane
With --latency-limit, transactions might get skipped even beyond the transaction count limit specified by -t, throwing off the expected number of transactions and thus the denominator for later stats. Be sure to stop skipping transactions once -t is reached. Also, include skipped transactions in the "cnt" fields; this requires discounting them again in a couple of places, but most places are better off with this definition. In particular this is needed to get correct overall stats for the combination of -R/-L/-t options. Merge some more processing into processXactStats() to simplify this. In passing, add a check that --progress-timestamp is specified only when --progress is. We might consider back-patching this, but given that it only matters for a combination of options, and given the lack of field complaints, consensus seems to be not to bother. Fabien Coelho, reviewed by Nikolay Shaplov Discussion: https://postgr.es/m/alpine.DEB.2.20.1704171422500.4025@lancre
2017-09-04Adjust pgbench to allow non-ASCII characters in variable names.Tom Lane
This puts it in sync with psql's notion of what is a valid variable name. Like psql, we document that "non-Latin letters" are allowed, but actually any non-ASCII character is accepted. Fabien Coelho Discussion: https://postgr.es/m/20170405.094548.1184280384967203518.t-ishii@sraoss.co.jp
2017-09-04Fix translatable stringAlvaro Herrera
Discussion: https://postgr.es/m/20170828130545.sdajqlpr37hmmd6a@alvherre.pgsql
2017-09-03Suppress compiler warnings in dshash.c.Tom Lane
Some compilers complain, not unreasonably, about left-shifting an int32 "1" and then assigning the result to an int64. In practice I sure hope that this data structure never gets large enough that an overflow would actually occur; but let's cast the constant to the right type to avoid the hazard. In passing, fix a typo in dshash.h. Amit Kapila, adjusted as per comment from Thomas Munro. Discussion: https://postgr.es/m/CAA4eK1+5vfVMYtjK_NX8O3-42yM3o80qdqWnQzGquPrbq6mb+A@mail.gmail.com
2017-09-03Fix macro-redefinition warning on MSVC.Tom Lane
In commit 9d6b160d7, I tweaked pg_config.h.win32 to use "#define HAVE_LONG_LONG_INT_64 1" rather than defining it as empty, for consistency with what happens in an autoconf'd build. But Solution.pm injects another definition of that macro into ecpg_config.h, leading to justifiable (though harmless) compiler whining. Make that one consistent too. Back-patch, like the previous patch. Discussion: https://postgr.es/m/CAEepm=1dWsXROuSbRg8PbKLh0S=8Ou-V8sr05DxmJOF5chBxqQ@mail.gmail.com
2017-09-01Improve division of labor between execParallel.c and nodeGather[Merge].c.Tom Lane
Move the responsibility for creating/destroying TupleQueueReaders into execParallel.c, to avoid duplicative coding in nodeGather.c and nodeGatherMerge.c. Also, instead of having DestroyTupleQueueReader do shm_mq_detach, do it in the caller (which is now only ExecParallelFinish). This means execParallel.c does both the attaching and detaching of the tuple-queue-reader shm_mqs, which seems less weird than the previous arrangement. These changes also eliminate a vestigial memory leak (of the pei->tqueue array). It's now demonstrable that rescans of Gather or GatherMerge don't leak memory. Discussion: https://postgr.es/m/8670.1504192177@sss.pgh.pa.us
2017-09-01Add memory info to getrusage outputPeter Eisentraut
Add the maxrss field to the getrusage output (log_*_stats). This was previously omitted because of portability concerns, but we feel this might not be a concern anymore. based on patch by Justin Pryzby <pryzby@telsasoft.com>
2017-09-01Tighten up some code in RelationBuildPartitionDesc.Robert Haas
This probably doesn't save anything meaningful in terms of performance, but making the code simpler is a good idea anyway. Code by Beena Emerson, extracted from a larger patch by Jeevan Ladhe, slightly adjusted by me. Discussion: http://postgr.es/m/CAOgcT0ONgwajdtkoq+AuYkdTPY9cLWWLjxt_k4SXue3eieAr+g@mail.gmail.com
2017-09-01Make [U]INT64CONST safe for use in #if conditions.Tom Lane
Instead of using a cast to force the constant to be the right width, assume we can plaster on an L, UL, LL, or ULL suffix as appropriate. The old approach to this is very hoary, dating from before we were willing to require compilers to have working int64 types. This fix makes the PG_INT64_MIN, PG_INT64_MAX, and PG_UINT64_MAX constants safe to use in preprocessor conditions, where a cast doesn't work. Other symbolic constants that might be defined using [U]INT64CONST are likewise safer than before. Also fix the SIZE_MAX macro to be similarly safe, if we are forced to provide a definition for that. The test added in commit 2e70d6b5e happens to do what we want even with the hack "(size_t) -1" definition, but we could easily get burnt on other tests in future. Back-patch to all supported branches, like the previous commits. Discussion: https://postgr.es/m/15883.1504278595@sss.pgh.pa.us
2017-09-01Ensure SIZE_MAX can be used throughout our code.Tom Lane
Pre-C99 platforms may lack <stdint.h> and thereby SIZE_MAX. We have a couple of places using the hack "(size_t) -1" as a fallback, but it wasn't universally available; which means the code added in commit 2e70d6b5e fails to compile everywhere. Move that hack to c.h so that we can rely on having SIZE_MAX everywhere. Per discussion, it'd be a good idea to make the macro's value safe for use in #if-tests, but that will take a bit more work. This is just a quick expedient to get the buildfarm green again. Back-patch to all supported branches, like the previous commit. Discussion: https://postgr.es/m/15883.1504278595@sss.pgh.pa.us
2017-09-01pg_dumpall: Add a -E flag to set the encoding, like pg_dump has.Robert Haas
Michael Paquier, reviewed by Fabien Coelho Discussion: http://postgr.es/m/CAB7nPqQcYWmrm2n-dVaMUhYPKFU_DxQwPuUGuC4ZF+8B=dS5xQ@mail.gmail.com
2017-09-01Use group updates when setting transaction status in clog.Robert Haas
Commit 0e141c0fbb211bdd23783afa731e3eef95c9ad7a introduced a mechanism to reduce contention on ProcArrayLock by having a single process clear XIDs in the procArray on behalf of multiple processes, reducing the need to hand the lock around. A previous attempt to introduce a similar mechanism for CLogControlLock in ccce90b398673d55b0387b3de66639b1b30d451b crashed and burned, but the design problem which resulted in those failures is believed to have been corrected in this version. Amit Kapila, with some cosmetic changes by me. See the previous commit message for additional credits. Discussion: http://postgr.es/m/CAA4eK1KudxzgWhuywY_X=yeSAhJMT4DwCjroV5Ay60xaeB2Eew@mail.gmail.com
2017-09-01Fix two-phase commit test for recovery modeAlvaro Herrera
The original code had a race condition because it never ensured the standby was caught up before proceeding; add a wait similar to every other place that does this. Author: Michaël Paquier Discussion: https://postgr.es/m/CAB7nPqTm9p+LCm1mVJYvgpwagRK+uibT-pKq0O2-paOWxT62jw@mail.gmail.com
2017-09-01Restore behavior for replication origin dropAlvaro Herrera
Do for replication origins what the previous commit did for replication slots: restore the original behavior of replication origin drop to raise an error rather than blocking, because users might be depending on the original behavior. Maintain the blocking behavior when invoked internally from logical replication subscription handling. Discussion: https://postgr.es/m/20170830133922.tlpo3lgfejm4n2cs@alvherre.pgsql
2017-09-01Avoid race condition in logical replication testSimon Riggs
Wait for slot to become inactive before continuing. Author: Petr Jelinek
2017-09-01Add a WAIT option to DROP_REPLICATION_SLOTAlvaro Herrera
Commit 9915de6c1cb2 changed the default behavior of DROP_REPLICATION_SLOT so that it would wait until any session holding the slot active would release it, instead of raising an error. But users are already depending on the original behavior, so revert to it by default and add a WAIT option to invoke the new behavior. Per complaint from Simone Gotti, in Discussion: https://postgr.es/m/CAEvsy6Wgdf90O6pUvg2wSVXL2omH5OPC-38OD4Zzgk-FXavj3Q@mail.gmail.com
2017-09-01Fix assorted carelessness about Datum vs. int64 vs. uint64Robert Haas
Bugs introduced by commit 81c5e46c490e2426db243eada186995da5bb0ba7
2017-08-31Try to repair poorly-considered code in previous commit.Robert Haas
2017-08-31Introduce 64-bit hash functions with a 64-bit seed.Robert Haas
This will be useful for hash partitioning, which needs a way to seed the hash functions to avoid problems such as a hash index on a hash partitioned table clumping all values into a small portion of the bucket space; it's also useful for anything that wants a 64-bit hash value rather than a 32-bit hash value. Just in case somebody wants a 64-bit hash value that is compatible with the existing 32-bit hash values, make the low 32-bits of the 64-bit hash value match the 32-bit hash value when the seed is 0. Robert Haas and Amul Sul Discussion: http://postgr.es/m/CA+Tgmoafx2yoJuhCQQOL5CocEi-w_uG4S2xT0EtgiJnPGcHW3g@mail.gmail.com
2017-08-31Avoid memory leaks when a GatherMerge node is rescanned.Tom Lane
Rescanning a GatherMerge led to leaking some memory in the executor's query-lifespan context, because most of the node's working data structures were simply abandoned and rebuilt from scratch. In practice, this might never amount to much, given the cost of relaunching worker processes --- but it's still pretty messy, so let's fix it. We can rearrange things so that the tuple arrays are simply cleared and reused, and we don't need to rebuild the TupleTableSlots either, just clear them. One small complication is that because we might get a different number of workers on each iteration, we can't keep the old convention that the leader's gm_slots[] entry is the last one; the leader might clobber a TupleTableSlot that we need for a worker in a future iteration. Hence, adjust the logic so that the leader has slot 0 always, while the active workers have slots 1..n. Back-patch to v10 to keep all the existing versions of nodeGatherMerge.c in sync --- because of the renumbering of the slots, there would otherwise be a very large risk that any future backpatches in this module would introduce bugs. Discussion: https://postgr.es/m/8670.1504192177@sss.pgh.pa.us
2017-08-31Expand partitioned tables in PartDesc order.Robert Haas
Previously, we expanded the inheritance hierarchy in the order in which find_all_inheritors had locked the tables, but that turns out to block quite a bit of useful optimization. For example, a partition-wise join can't count on two tables with matching bounds to get expanded in the same order. Where possible, this change results in expanding partitioned tables in *bound* order. Bound order isn't well-defined for a list-partitioned table with a null-accepting partition or for a list-partitioned table where the bounds for a single partition are interleaved with other partitions. However, when expansion in bound order is possible, it opens up further opportunities for optimization, such as strength-reducing MergeAppend to Append when the expansion order matches the desired sort order. Patch by me, with cosmetic revisions by Ashutosh Bapat. Discussion: http://postgr.es/m/CA+TgmoZrKj7kEzcMSum3aXV4eyvvbh9WD=c6m=002WMheDyE3A@mail.gmail.com
2017-08-31Clean up shm_mq cleanup.Tom Lane
The logic around shm_mq_detach was a few bricks shy of a load, because (contrary to the comments for shm_mq_attach) all it did was update the shared shm_mq state. That left us leaking a bit of process-local memory, but much worse, the on_dsm_detach callback for shm_mq_detach was still armed. That means that whenever we ultimately detach from the DSM segment, we'd run shm_mq_detach again for already-detached, possibly long-dead queues. This accidentally fails to fail today, because we only ever re-use a shm_mq's memory for another shm_mq, and multiple detach attempts on the last such shm_mq are fairly harmless. But it's gonna bite us someday, so let's clean it up. To do that, change shm_mq_detach's API so it takes a shm_mq_handle not the underlying shm_mq. This makes the callers simpler in most cases anyway. Also fix a few places in parallel.c that were just pfree'ing the handle structs rather than doing proper cleanup. Back-patch to v10 because of the risk that the revenant shm_mq_detach callbacks would cause a live bug sometime. Since this is an API change, it's too late to do it in 9.6. (We could make a variant patch that preserves API, but I'm not excited enough to do that.) Discussion: https://postgr.es/m/8670.1504192177@sss.pgh.pa.us