summaryrefslogtreecommitdiff
path: root/src/backend/lib
AgeCommit message (Collapse)Author
2024-12-17Accommodate very large dshash tables.Nathan Bossart
If a dshash table grows very large (e.g., the dshash table for cumulative statistics when there are millions of tables), resizing it may fail with an error like: ERROR: invalid DSA memory alloc request size 1073741824 To fix, permit dshash resizing to allocate more than 1 GB by providing the DSA_ALLOC_HUGE flag. Reported-by: Andreas Scherbaum Author: Matthias van de Meent Reviewed-by: Cédric Villemain, Michael Paquier, Andres Freund Discussion: https://postgr.es/m/80a12d59-0d5e-4c54-866c-e69cd6536471%40pgug.de Backpatch-through: 13
2022-07-11Fix lock assertions in dshash.c.Thomas Munro
dshash.c previously maintained flags to be able to assert that you didn't hold any partition lock. These flags could get out of sync with reality in error scenarios. Get rid of all that, and make assertions about the locks themselves instead. Since LWLockHeldByMe() loops internally, we don't want to put that inside another loop over all partition locks. Introduce a new debugging-only interface LWLockAnyHeldByMe() to avoid that. This problem was noted by Tom and Andres while reviewing changes to support the new shared memory stats system, and later showed up in reality while working on commit 389869af. Back-patch to 11, where dshash.c arrived. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220311012712.botrpsikaufzteyt@alap3.anarazel.de Discussion: https://postgr.es/m/CA%2BhUKGJ31Wce6HJ7xnVTKWjFUWQZPBngxfJVx4q0E98pDr3kAw%40mail.gmail.com
2021-04-20Fix typo in commentMagnus Hagander
Author: Julien Rouhaud Backpatch-through: 11 Discussion: https://postgr.es/m/20210420121659.odjueyd4rpilorn5@nol
2020-07-30Use pg_bitutils for HyperLogLog.Jeff Davis
Using pg_leftmost_one_post32() yields substantial performance benefits. Backpatching to version 13 because HLL is used for HashAgg improvements in 9878b643, which was also backpatched to 13. Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-WzkGvDKVDo+0YvfvZ+1CE=iCi88DCOGFF3i1hTGGaxcKPw@mail.gmail.com Backpatch-through: 13
2020-02-27Move src/backend/utils/hash/hashfn.c to src/commonRobert Haas
This also involves renaming src/include/utils/hashutils.h, which becomes src/include/common/hashfn.h. Perhaps an argument can be made for keeping the hashutils.h name, but it seemed more consistent to make it match the name of the file, and also more descriptive of what is actually going on here. Patch by me, reviewed by Suraj Kharage and Mark Dilger. Off-list advice on how not to break the Windows build from Davinder Singh and Amit Kapila. Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-24Put all the prototypes for hashfn.c into the same header file.Robert Haas
Previously, some of the prototypes for functions in hashfn.c were in utils/hashutils.h and others were in utils/hsearch.h, but that is confusing and has no particular benefit. Patch by me, reviewed by Suraj Kharage and Mark Dilger. Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-01-01Update copyrights for 2020Bruce Momjian
Backpatch-through: update all files in master, backpatch legal files through 9.4
2019-11-05Make StringInfo available to frontend code.Andres Freund
There's plenty places in frontend code that could benefit from a string buffer implementation. Some because it yields simpler and faster code, and some others because of the desire to share code between backend and frontend. While there is a string buffer implementation available to frontend code, libpq's PQExpBuffer, it is clunkier than stringinfo, it introduces a libpq dependency, doesn't allow for sharing between frontend and backend code, and has a higher API/ABI stability requirement due to being exposed via libpq. Therefore it seems best to just making StringInfo being usable by frontend code. There's not much to do for that, except for rewriting two subsequent elog/ereport calls into others types of error reporting, and deciding on a maximum string length. For the maximum string size I decided to privately define MaxAllocSize to the same value as used in the backend. It seems likely that we'll want to reconsider this for both backend and frontend code in the not too far away future. For now I've left stringinfo.h in lib/, rather than common/, to reduce the likelihood of unnecessary breakage. We could alternatively decide to provide a redirecting stringinfo.h in lib/, or just not provide compatibility. Author: Andres Freund Reviewed-By: Kyotaro Horiguchi, Daniel Gustafsson Discussion: https://postgr.es/m/20190920051857.2fhnvhvx4qdddviz@alap3.anarazel.de
2019-11-05Split all OBJS style lines in makefiles into one-line-per-entry style.Andres Freund
When maintaining or merging patches, one of the most common sources for conflicts are the list of objects in makefiles. Especially when the split across lines has been changed on both sides, which is somewhat common due to attempting to stay below 80 columns, those conflicts are unnecessarily laborious to resolve. By splitting, and alphabetically sorting, OBJS style lines into one object per line, conflicts should be less frequent, and easier to resolve when they still occur. Author: Andres Freund Discussion: https://postgr.es/m/20191029200901.vww4idgcxv74cwes@alap3.anarazel.de
2019-07-08Fix inconsistencies in the codeMichael Paquier
This addresses a couple of issues in the code: - Typos and inconsistencies in comments and function declarations. - Removal of unreferenced function declarations. - Removal of unnecessary compile flags. - A cleanup error in regressplans.sh. Author: Alexander Lakhin Discussion: https://postgr.es/m/0c991fdf-2670-1997-c027-772a420c4604@gmail.com
2019-06-17Fix more typos and inconsistencies in the treeMichael Paquier
Author: Alexander Lakhin Discussion: https://postgr.es/m/0a5419ea-1452-a4e6-72ff-545b1a5a8076@gmail.com
2019-06-08Fix assorted inconsistencies.Amit Kapila
There were a number of issues in the recent commits which include typos, code and comments mismatch, leftover function declarations. Fix them. Reported-by: Alexander Lakhin Author: Alexander Lakhin, Amit Kapila and Amit Langote Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/ef0c0232-0c1d-3a35-63d4-0ebd06e31387@gmail.com
2019-05-24Update copyright year.Thomas Munro
Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CA%2BhUKGJFWXmtYo6Frd77RR8YXCHz7hJ2mRy5aHV%3D7fJOqDnBHA%40mail.gmail.com
2019-05-22Phase 2 pgindent run for v12.Tom Lane
Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
2019-04-09Fix example in comment.Heikki Linnakangas
Author: Adrien Nayrat
2019-03-25Further code review for new integerset code.Tom Lane
Mostly cosmetic adjustments, but I added a more reliable method of detecting whether an iteration is in progress.
2019-03-25Clean up the Simple-8b encoder code.Heikki Linnakangas
Coverity complained that simple8b_encode() might read beyond the end of the 'diffs' array, in the loop to encode the integers. That was a false positive, because we never get into the loop in modes 0 or 1, and the array is large enough for all the other modes. But I admit it's very subtle, so it's not surprising that Coverity didn't see it, and it's not very obvious to humans either. Refactor it, so that the second loop re-computes the differences, instead of carrying them over from the first loop in the 'diffs' array. This way, the 'diffs' array is not needed anymore. It makes no measurable difference in performance, and seems more straightforward this way. Also, improve the comments in simple8b_encode(): fix the comment about its return value that was flat-out wrong, and explain the condition when it returns EMPTY_CODEWORD better. In the passing, move the 'selector' from the codeword's low bits to the high bits. It doesn't matter much, but looking at the original paper, and googling around for other Simple-8b implementations, that's how it's usually done. Per Coverity, and Tom Lane's report off-list.
2019-03-22Fix yet more portability bugs in integerset and its tests.Heikki Linnakangas
There were more large constants that needed UINT64CONST. And one variable was declared as "int", when it needed to be uint64. These bugs were only visible on 32-bit systems; clearly I should've tested on one, given that this code does a lot of work with 64-bit integers. Also, in the test "huge distances" test, the code created some values with random distances between them, but the test logic didn't take into account the possibility that the random distance was exactly 1. That never actually happens with the seed we're using, but let's be tidy.
2019-03-22Add IntegerSet, to hold large sets of 64-bit ints efficiently.Heikki Linnakangas
The set is implemented as a B-tree, with a compact representation at leaf items, using Simple-8b algorithm, so that clusters of nearby values use less memory. The IntegerSet isn't used for anything yet, aside from the test code, but we have two patches in the works that would benefit from this: A patch to allow GiST vacuum to delete empty pages, and a patch to reduce heap VACUUM's memory usage, by storing the list of dead TIDs more efficiently and lifting the 1 GB limit on its size. This includes a unit test module, in src/test/modules/test_integerset. It can be used to verify correctness, as a regression test, but if you run it manully, it can also print memory usage and execution time of some of the tests. Author: Heikki Linnakangas, Andrey Borodin Reviewed-by: Julien Rouhaud Discussion: https://www.postgresql.org/message-id/b5e82599-1966-5783-733c-1a947ddb729f@iki.fi
2019-03-11Move hash_any prototype from access/hash.h to utils/hashutils.hAlvaro Herrera
... as well as its implementation from backend/access/hash/hashfunc.c to backend/utils/hash/hashfn.c. access/hash is the place for the hash index AM, not really appropriate for generic facilities, which is what hash_any is; having things the old way meant that anything using hash_any had to include the AM's include file, pointlessly polluting its namespace with unrelated, unnecessary cruft. Also move the HTEqual strategy number to access/stratnum.h from access/hash.h. To avoid breaking third-party extension code, add an #include "utils/hashutils.h" to access/hash.h. (An easily removed line by committers who enjoy their asbestos suits to protect them from angry extension authors.) Discussion: https://postgr.es/m/201901251935.ser5e4h6djt2@alvherre.pgsql
2019-02-15Make use of compiler builtins and/or assembly for CLZ, CTZ, POPCNT.Tom Lane
Test for the compiler builtins __builtin_clz, __builtin_ctz, and __builtin_popcount, and make use of these in preference to handwritten C code if they're available. Create src/port infrastructure for "leftmost one", "rightmost one", and "popcount" so as to centralize these decisions. On x86_64, __builtin_popcount generally won't make use of the POPCNT opcode because that's not universally supported yet. Provide code that checks CPUID and then calls POPCNT via asm() if available. This requires indirecting through a function pointer, which is an annoying amount of overhead for a one-instruction operation, but it's probably not worth working harder than this for our current use-cases. I'm not sure we've found all the existing places that could profit from this new infrastructure; but we at least touched all the ones that used copied-and-pasted versions of the bitmapset.c code, and got rid of multiple copies of the associated constant arrays. While at it, replace c-compiler.m4's one-per-builtin-function macros with a single one that can handle all the cases we need to worry about so far. Also, because I'm paranoid, make those checks into AC_LINK checks rather than just AC_COMPILE; the former coding failed to verify that libgcc has support for the builtin, in cases where it's not inline code. David Rowley, Thomas Munro, Alvaro Herrera, Tom Lane Discussion: https://postgr.es/m/CAKJS1f9WTAGG1tPeJnD18hiQW5gAk59fQ6WK-vfdAKEHyRg2RA@mail.gmail.com
2019-02-15Revert attempts to use POPCNT etc instructionsAlvaro Herrera
This reverts commits fc6c72747ae6, 109de05cbb03, d0b4663c23b7 and 711bab1e4d19. Somebody will have to try harder before submitting this patch again. I've spent entirely too much time on it already, and the #ifdef maze yet to be written in order for it to build at all got on my nerves. The amount of work needed to get a platform-specific performance improvement that's barely above the noise level is not worth it.
2019-02-13Add basic support for using the POPCNT and SSE4.2s LZCNT opcodesAlvaro Herrera
These opcodes have been around in the AMD world since 2007, and 2008 in the case of intel. They're supported in GCC and Clang via some __builtin macros. The opcodes may be unavailable during runtime, in which case we fall back on a C-based implementation of the code. In order to get the POPCNT instruction we must pass the -mpopcnt option to the compiler. We do this only for the pg_bitutils.c file. David Rowley (with fragments taken from a patch by Thomas Munro) Discussion: https://postgr.es/m/CAKJS1f9WTAGG1tPeJnD18hiQW5gAk59fQ6WK-vfdAKEHyRg2RA@mail.gmail.com
2019-01-02Update copyright for 2019Bruce Momjian
Backpatch-through: certain files through 9.4
2018-11-06Rename rbtree.c functions to use "rbt" prefix not "rb" prefix.Tom Lane
The "rb" prefix is used by Ruby, so that our existing code results in name collisions that break plruby. We discussed ways to prevent that by adjusting dynamic linker options, but it seems that at best we'd move the pain to other cases. Renaming to avoid the collision is the only portable fix anyway. Fortunately, our rbtree code is not (yet?) widely used --- in core, there's only a single usage in GIN --- so it seems likely that we can get away with a rename. I chose to do this basically as s/rb/rbt/g, except for places where there already was a "t" after "rb". The patch could have been made smaller by only touching linker-visible symbols, but it would have resulted in oddly inconsistent-looking code. Better to make it look like "rbt" was the plan all along. Back-patch to v10. The rbtree.c code exists back to 9.5, but rb_iterate() which is the actual immediate source of pain was added in v10, so it seems like changing the names before that would have more risk than benefit. Per report from Pavel Raiskup. Discussion: https://postgr.es/m/4738198.8KVIIDhgEB@nb.usersys.redhat.com
2018-10-29Remove incorrect comment in dshash.c.Thomas Munro
Back-patch to 11. Author: Antonin Houska Discussion: https://postgr.es/m/8726.1540553521%40localhost
2018-09-26Implement %m in src/port/snprintf.c, and teach elog.c to rely on that.Tom Lane
I started out with the idea that we needed to detect use of %m format specs in contexts other than elog/ereport calls, because we couldn't rely on that working in *printf calls. But a better answer is to fix things so that it does work. Now that we're using snprintf.c all the time, we can implement %m in that and we've fixed the problem. This requires also adjusting our various printf-wrapping functions so that they ensure "errno" is preserved when they call snprintf.c. Remove elog.c's handmade implementation of %m, and let it rely on snprintf to support the feature. That should provide some performance gain, though I've not attempted to measure it. There are a lot of places where we could now simplify 'printf("%s", strerror(errno))' into 'printf("%m")', but I'm not in any big hurry to make that happen. Patch by me, reviewed by Michael Paquier Discussion: https://postgr.es/m/2975.1526862605@sss.pgh.pa.us
2018-07-16doc: Update redirecting linksPeter Eisentraut
Update links that resulted in redirects. Most are changes from http to https, but there are also some other minor edits. (There are still some redirects where the target URL looks less elegant than the one we currently have. I have left those as is.)
2018-05-22Add missing files to src/backend/lib/README.Heikki Linnakangas
The README lists all the files available in the directory, along with short descriptions of each, but a few newly added ones were missing. While we're at it, reorder the list into alphabetical order. Author: Takeshi Ideriha Discussion: https://www.postgresql.org/message-id/4E72940DA2BF16479384A86D54D0988A56793487@G01JPEXMBKW04
2018-04-26Post-feature-freeze pgindent run.Tom Lane
Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us
2018-03-31Fix non-portable use of round().Andres Freund
round() is from C99. Use rint() instead. There are behavioral differences between round() and rint(), but they should not matter to the Bloom filter optimal_k() function. We already assume POSIX behavior for rint(), so there is no question of rint() not using "rounds towards nearest" as its rounding mode. Cleanup from commit 51bc271790eb234a1ba4d14d3e6530f70de92ab5. Per buildfarm member thrips. Author: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wzn76eCGUonARy-wrVtMHsf+4cvbK_oJAWTLfORTU5ki0w@mail.gmail.com
2018-03-31Add Bloom filter implementation.Andres Freund
A Bloom filter is a space-efficient, probabilistic data structure that can be used to test set membership. Callers will sometimes incur false positives, but never false negatives. The rate of false positives is a function of the total number of elements and the amount of memory available for the Bloom filter. Two classic applications of Bloom filters are cache filtering, and data synchronization testing. Any user of Bloom filters must accept the possibility of false positives as a cost worth paying for the benefit in space efficiency. This commit adds a test harness extension module, test_bloomfilter. It can be used to get a sense of how the Bloom filter implementation performs under varying conditions. This is infrastructure for the upcoming "heapallindexed" amcheck patch, which verifies the consistency of a heap relation against one of its indexes. Author: Peter Geoghegan Reviewed-By: Andrey Borodin, Michael Paquier, Thomas Munro, Andres Freund Discussion: https://postgr.es/m/CAH2-Wzm5VmG7cu1N-H=nnS57wZThoSDQU+F5dewx3o84M+jY=g@mail.gmail.com
2018-03-01Minor clean-up in dshash.{c,h}.Andres Freund
For consistency with other code that deals in numbers of buckets, the macro BUCKETS_PER_PARTITION should produce a value of type size_t. Also, fix a mention of an obsolete proposed name for dshash.c that appeared in a comment. Author: Thomas Munro, based on an observation from Amit Kapila Discussion: https://postgr.es/m/CAA4eK1%2BBOp5aaW3aHEkg5Bptf8Ga_BkBnmA-%3DXcAXShs0yCiYQ%40mail.gmail.com
2018-02-16Remove some inappropriate #includes.Tom Lane
Other header files should never #include postgres.h (nor postgres_fe.h, nor c.h), per project policy. Also, there's no need for any backend .c file to explicitly include elog.h or palloc.h, because postgres.h pulls those in already. Extracted from a larger patch by Kyotaro Horiguchi. The rest of the removals he suggests require more study, but these are no-brainers. Discussion: https://postgr.es/m/20180215.200447.209320006.horiguchi.kyotaro@lab.ntt.co.jp
2018-01-02Update copyright for 2018Bruce Momjian
Backpatch-through: certain files through 9.3
2017-12-13Rethink MemoryContext creation to improve performance.Tom Lane
This patch makes a number of interrelated changes to reduce the overhead involved in creating/deleting memory contexts. The key ideas are: * Include the AllocSetContext header of an aset.c context in its first malloc request, rather than allocating it separately in TopMemoryContext. This means that we now always create an initial or "keeper" block in an aset, even if it never receives any allocation requests. * Create freelists in which we can save and recycle recently-destroyed asets (this idea is due to Robert Haas). * In the common case where the name of a context is a constant string, just store a pointer to it in the context header, rather than copying the string. The first change eliminates a palloc/pfree cycle per context, and also avoids bloat in TopMemoryContext, at the price that creating a context now involves a malloc/free cycle even if the context never receives any allocations. That would be a loser for some common usage patterns, but recycling short-lived contexts via the freelist eliminates that pain. Avoiding copying constant strings not only saves strlen() and strcpy() overhead, but is an essential part of the freelist optimization because it makes the context header size constant. Currently we make no attempt to use the freelist for contexts with non-constant names. (Perhaps someday we'll need to think harder about that, but in current usage, most contexts with custom names are long-lived anyway.) The freelist management in this initial commit is pretty simplistic, and we might want to refine it later --- but in common workloads that will never matter because the freelists will never get full anyway. To create a context with a non-constant name, one is now required to call AllocSetContextCreateExtended and specify the MEMCONTEXT_COPY_NAME option. AllocSetContextCreate becomes a wrapper macro, and it includes a test that will complain about non-string-literal context name parameters on gcc and similar compilers. An unfortunate side effect of making AllocSetContextCreate a macro is that one is now *required* to use the size parameter abstraction macros (ALLOCSET_DEFAULT_SIZES and friends) with it; the pre-9.6 habit of writing out individual size parameters no longer works unless you switch to AllocSetContextCreateExtended. Internally to the memory-context-related modules, the context creation APIs are simplified, removing the rather baroque original design whereby a context-type module called mcxt.c which then called back into the context-type module. That saved a bit of code duplication, but not much, and it prevented context-type modules from exercising control over the allocation of context headers. In passing, I converted the test-and-elog validation of aset size parameters into Asserts to save a few more cycles. The original thought was that callers might compute size parameters on the fly, but in practice nobody does that, so it's useless to expend cycles on checking those numbers in production builds. Also, mark the memory context method-pointer structs "const", just for cleanliness. Discussion: https://postgr.es/m/2264.1512870796@sss.pgh.pa.us
2017-10-11Allow to avoid NUL-byte management for stringinfos and use in format.c.Andres Freund
In a lot of the places having appendBinaryStringInfo() maintain a trailing NUL byte wasn't actually meaningful, e.g. when appending an integer which can contain 0 in one of its bytes. Removing this yields some small speedup, but more importantly will be more consistent when providing faster variants of pq_sendint etc. Author: Andres Freund Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de
2017-09-18Fix uninitialized variable in dshash.c.Andres Freund
A bugfix for commit 8c0d7bafad36434cb08ac2c78e69ae72c194ca20. The code would have crashed if hashtable->size_log2 ever had the same value as hashtable->control->size_log2 by coincidence. Per Valgrind. Author: Thomas Munro Reported-By: Tomas Vondra Discussion: https://postgr.es/m/e72fb33c-4f31-f276-e972-263d9b59554d%402ndquadrant.com
2017-09-10Remove pre-order and post-order traversal logic for red-black trees.Tom Lane
This code isn't used, and there's no clear reason why anybody would ever want to use it. These traversal mechanisms don't yield a visitation order that is semantically meaningful for any external purpose, nor are they any faster or simpler than the left-to-right or right-to-left traversals. (In fact, some rough testing suggests they are slower :-(.) Moreover, these mechanisms are impossible to test in any arm's-length fashion; doing so requires knowledge of the red-black tree's internal implementation. Hence, let's just jettison them. Discussion: https://postgr.es/m/17735.1505003111@sss.pgh.pa.us
2017-09-03Suppress compiler warnings in dshash.c.Tom Lane
Some compilers complain, not unreasonably, about left-shifting an int32 "1" and then assigning the result to an int64. In practice I sure hope that this data structure never gets large enough that an overflow would actually occur; but let's cast the constant to the right type to avoid the hazard. In passing, fix a typo in dshash.h. Amit Kapila, adjusted as per comment from Thomas Munro. Discussion: https://postgr.es/m/CAA4eK1+5vfVMYtjK_NX8O3-42yM3o80qdqWnQzGquPrbq6mb+A@mail.gmail.com
2017-08-24Consolidate the function pointer types used by dshash.c.Andres Freund
Commit 8c0d7bafad36434cb08ac2c78e69ae72c194ca20 introduced dshash with hash and compare functions like DynaHash's, and also variants that take a user data pointer instead of size. Simplify the interface by merging them into a single pair of function pointer types that take both size and a user data pointer. Since it is anticipated that memcmp and tag_hash behavior will be a common requirement, provide wrapper functions dshash_memcmp and dshash_memhash that conform to the new function types. Author: Thomas Munro Reviewed-By: Andres Freund Discussion: https://postgr.es/m/20170823054644.efuzftxjpfi6wwqs%40alap3.anarazel.de
2017-08-24Fix unlikely shared memory leak after failure in dshash_create().Andres Freund
Tidy-up for commit 8c0d7bafad36434cb08ac2c78e69ae72c194ca20, based on a complaint from Andres Freund. Author: Thomas Munro Reviewed-By: Andres Freund Discussion: https://postgr.es/m/20170823054644.efuzftxjpfi6wwqs%40alap3.anarazel.de
2017-08-22Hash tables backed by DSA shared memory.Andres Freund
Add general purpose chaining hash tables for DSA memory. Unlike DynaHash in shared memory mode, these hash tables can grow as required, and cope with being mapped into different addresses in different backends. There is a wide range of potential users for such a hash table, though it's very likely the interface will need to evolve as we come to understand the needs of different kinds of users. E.g support for iterators and incremental resizing is planned for later commits and the details of the callback signatures are likely to change. Author: Thomas Munro Reviewed-By: John Gorman, Andres Freund, Dilip Kumar, Robert Haas Discussion: https://postgr.es/m/CAEepm=3d8o8XdVwYT6O=bHKsKAM2pu2D6sV1S_=4d+jStVCE7w@mail.gmail.com https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
2017-06-21Phase 3 of pgindent updates.Tom Lane
Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21Phase 2 of pgindent updates.Tom Lane
Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21Initial pgindent run with pg_bsd_indent version 2.0.Tom Lane
The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname *varname", as well as some similar cases for const- or volatile-qualified pointers. * Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-05-17Post-PG 10 beta1 pgindent runBruce Momjian
perltidy run not included.
2017-05-10Revert "Permit dump/reload of not-too-large >1GB tuples"Alvaro Herrera
This reverts commits fa2fa9955280 and 42f50cb8fa98. While the functionality that was intended to be provided by these commits is desired, the patch didn't actually solve as many of the problematic situations as we hoped, and it created a bunch of its own problems. Since we're going to require more extensive changes soon for other reasons and users have been working around these problems for a long time already, there is no point in spending effort in fixing this halfway measure. Per complaint from Tom Lane. Discussion: https://postgr.es/m/21407.1484606922@sss.pgh.pa.us (Commit fa2fa9955280 had already been reverted in branches 9.5 as f858524ee4f and 9.6 as e9e44a0953, so this touches master only. Commit 42f50cb8fa98 was not present in the older branches.)
2017-03-27Support hashed aggregation with grouping sets.Andrew Gierth
This extends the Aggregate node with two new features: HashAggregate can now run multiple hashtables concurrently, and a new strategy MixedAggregate populates hashtables while doing sorted grouping. The planner will now attempt to save as many sorts as possible when planning grouping sets queries, while not exceeding work_mem for the estimated combined sizes of all hashtables used. No SQL-level changes are required. There should be no user-visible impact other than the new EXPLAIN output and possible changes to result ordering when ORDER BY was not used (which affected a few regression tests). The enable_hashagg option is respected. Author: Andrew Gierth Reviewers: Mark Dilger, Andres Freund Discussion: https://postgr.es/m/87vatszyhj.fsf@news-spur.riddles.org.uk
2017-01-10Fix overflow check in StringInfo; add missing castsAlvaro Herrera
A few thinkos I introduced in fa2fa9955280. Also, amend a similarly broken comment. Report by Daniel Vérité. Authors: Daniel Vérité, Álvaro Herrera Discussion: https://postgr.es/m/1706e85e-60d2-494e-8a64-9af1e1b2186e@manitou-mail.org