path: root/src/include/lib
Age  Commit message  Author
2016-12-06  Revert "Permit dump/reload of not-too-large >1GB tuples"  (Alvaro Herrera)
This reverts commit 646655d264f17cf7fdbc6425ef8bc9a2f9f9ee41. Per Tom Lane, changing the definition of StringInfoData amounts to an ABI break, which is unacceptable in back branches.
2016-12-02  Permit dump/reload of not-too-large >1GB tuples  (Alvaro Herrera)
Our documentation states that our maximum field size is 1 GB, and that our maximum row size is 1.6 TB. However, while this might be attainable in theory with enough contortions, it is not workable in practice; for starters, pg_dump fails to dump tables containing rows larger than 1 GB, even if individual columns are well below the limit; and even if one does manage to manufacture a dump file containing a row that large, the server refuses to load it anyway.

This commit enables dumping and reloading of such tuples, provided two conditions are met:

1. no single column is larger than 1 GB (in output size -- for bytea this includes the formatting overhead)
2. the whole row is not larger than 2 GB

There are three related changes to enable this:

a. StringInfo's API now has two additional functions that allow creating a string that grows beyond the typical 1 GB limit (a "long" string). ABI compatibility is maintained. We still limit these strings to 2 GB, though, for reasons explained below.
b. COPY now uses long StringInfos, so that pg_dump doesn't choke trying to emit rows longer than 1 GB.
c. heap_form_tuple now uses the MCXT_ALLOW_HUGE flag in its allocation for the input tuple, which means that large tuples are accepted on input. Note that at this point we do not apply any further limit to the input tuple size.

The main reason to limit to 2 GB is that the FE/BE protocol uses 32-bit length words to describe each row; and because the documentation is ambiguous about their signedness and libpq does treat them as signed, we cannot use the highest-order bit. Additionally, the StringInfo API uses "int" (which is 4 bytes wide on most platforms) in many places, so we'd need to change that API too in order to go further, which would have lots of fallout.

Backpatch to 9.5, which is the oldest branch that has MemoryContextAllocExtended, a necessary piece of infrastructure. We could apply this to 9.4 with very minimal additional effort, but any further than that would require backpatching "huge" allocations too.

This is the largest set of changes we could find that can be back-patched without breaking compatibility with existing systems. Fixing a bigger set of problems (for example, dumping tuples bigger than 2 GB, or dumping fields bigger than 1 GB) would require changing the FE/BE protocol and/or changing the StringInfo API in an ABI-incompatible way, neither of which would be back-patchable.

Authors: Daniel Vérité, Álvaro Herrera
Reviewed by: Tomas Vondra
Discussion: https://postgr.es/m/20160229183023.GA286012@alvherre.pgsql
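The commit message does not name the two new StringInfo functions, so the sketch below uses an assumed constructor name (makeLongStringInfo() is an assumption, not a confirmed API); it only illustrates the intended calling pattern, with the ordinary append API left unchanged:

    #include "lib/stringinfo.h"

    static void
    emit_wide_row(const char *rowdata, int rowlen)
    {
        /* makeLongStringInfo() is an assumed name for the new constructor */
        StringInfo  buf = makeLongStringInfo();

        /* appends work exactly as before; only the growth ceiling differs:
         * past 1 GB, up to the 2 GB cap imposed by the protocol's 32-bit
         * (effectively signed) row-length words */
        appendBinaryStringInfo(buf, rowdata, rowlen);
    }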
2015-08-23  Avoid use of float arithmetic in bipartite_match.c.  (Tom Lane)
Since the distances used in this algorithm are small integers (not more than the size of the U set, in fact), there is no good reason to use float arithmetic for them. Use short ints instead: they're smaller, faster, and require no special portability assumptions.

Per testing by Greg Stark, which disclosed that the code got into an infinite loop on VAX for lack of IEEE-style float infinities. We don't really care all that much whether Postgres can run on a VAX anymore, but there seems sufficient reason to change this code anyway.

In passing, make a few other small adjustments to make the code match usual Postgres coding style a bit better.
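A hedged sketch of the resulting shape (illustrative names, not the committed code): the distances are bounded by |U|, so a short with a large sentinel replaces float and its reliance on IEEE infinities:

    #include <limits.h>

    #define HK_INFINITY  SHRT_MAX   /* sentinel in place of float INFINITY */

    static void
    init_distances(short *distance, int nnodes)
    {
        /* every node starts "infinitely" far away, in plain integers */
        for (int i = 0; i < nnodes; i++)
            distance[i] = HK_INFINITY;
    }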
2015-05-23  pgindent run for 9.5  (Bruce Momjian)
2015-05-16  Support GROUPING SETS, CUBE and ROLLUP.  (Andres Freund)
This SQL-standard functionality allows aggregating data by several different GROUP BY clauses at once. Each grouping set returns rows with the columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data.

The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step, thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation, or a mix of hashing and sorting, is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation.

Instead of, as in earlier versions of the patch, representing the chain of sort and aggregation steps as full-blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg nodes to describe each phase, but they're not part of the plan tree; instead they are additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing GROUP BY plans, makes rescans fairly simple, avoids very deep plans (leading to slow EXPLAINs), and easily allows skipping the sorting step if the underlying data is already sorted by other means.

A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, nor has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted.

Needs a catversion bump because stored rules may change.

Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-03-26  Tweak __attribute__-wrapping macros for better pgindent results.  (Tom Lane)
This improves on commit bbfd7edae5aa5ad5553d3c7e102f2e450d4380d4 by making two simple changes:

* pg_attribute_noreturn now takes parentheses, ie pg_attribute_noreturn(). Likewise pg_attribute_unused(), pg_attribute_packed(). This reduces pgindent's tendency to misformat declarations involving them.

* attributes are now always attached to function declarations, not definitions. Previously some places were taking creative shortcuts, which were not merely candidates for bad misformatting by pgindent but often were outright wrong anyway. (It does little good to put a noreturn annotation where callers can't see it.) In any case, if we would like to believe that these macros can be used with non-gcc compilers, we should avoid gratuitous variance in usage patterns.

I also went through and manually improved the formatting of a lot of declarations, and got rid of excessively repetitive (and now obsolete anyway) comments informing the reader what pg_attribute_printf is for.
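In code, the rule looks like this (die_with_message is a hypothetical function used only for illustration; the pg_attribute_* macros come from c.h):

    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* attributes go on the declaration, where callers can see them */
    extern void die_with_message(const char *fmt, ...)
                pg_attribute_printf(1, 2) pg_attribute_noreturn();

    /* ... and the definition stays bare */
    void
    die_with_message(const char *fmt, ...)
    {
        va_list     args;

        va_start(args, fmt);
        vfprintf(stderr, fmt, args);
        va_end(args);
        exit(1);
    }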
2015-03-11  Add macros wrapping all usage of gcc's __attribute__.  (Andres Freund)
Until now __attribute__() was defined to be empty for all compilers but gcc. That's problematic because it prevents using it in other compilers, which is necessary e.g. for atomics portability. It's also just generally dubious to do so in a header as widely included as c.h.

Instead add pg_attribute_format_arg and pg_attribute_printf macros which are implemented in the compilers that understand them. Also add pg_attribute_noreturn and pg_attribute_packed, but don't provide fallbacks, since they can affect functionality.

This means that external code that, possibly unwittingly, relied on __attribute__ being defined to be empty on !gcc compilers may now run into warnings or errors on those compilers. But there shouldn't be many occurrences of that, and it's hard to work around...

Discussion: 54B58BA3.8040302@ohmu.fi
Author: Oskari Saarenmaa, with some minor changes by me.
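A simplified sketch of the pattern this sets up in c.h (the real header covers more compilers and attributes; this is illustrative only):

    /* wrap __attribute__ per-attribute instead of defining it away */
    #if defined(__GNUC__)
    #define pg_attribute_format_arg(a) __attribute__((format_arg(a)))
    #define pg_attribute_printf(f, a)  __attribute__((format(printf, f, a)))
    #else
    /* safe to fall back to empty: these only improve diagnostics */
    #define pg_attribute_format_arg(a)
    #define pg_attribute_printf(f, a)
    #endif

    /* pg_attribute_noreturn and pg_attribute_packed get no empty
     * fallback, because silently dropping them can change behavior */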
2015-02-17  Fix a bug in pairing heap removal code.  (Heikki Linnakangas)
After removal, the next_sibling pointer of a node was sometimes incorrectly left pointing to another node in the heap, which meant that a node was sometimes linked twice into the heap. Surprisingly that didn't cause any crashes in my testing, but it was clearly wrong and could easily segfault in other scenarios.

Also always keep the prev_or_parent pointer NULL on the root node. That was not a correctness issue AFAICS, but let's be tidy.

Add a debugging function to dump the contents of a pairing heap as a string. It's #ifdef'd out, as it's not used for anything in any normal code, but it was highly useful in debugging this. Let's keep it handy for further reference.
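For reference, a hedged sketch of the embedding pattern the pairing heap uses (MyItem and its ordering are illustrative; the pairingheap_* names follow lib/pairingheap.h, with the comparator convention assumed):

    #include "lib/pairingheap.h"

    typedef struct MyItem
    {
        int              priority;
        pairingheap_node ph_node;   /* embedded heap link */
    } MyItem;

    static int
    my_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
    {
        const MyItem *ia = pairingheap_const_container(MyItem, ph_node, a);
        const MyItem *ib = pairingheap_const_container(MyItem, ph_node, b);

        /* smallest priority value surfaces first: return >0 when "a"
         * should be above "b" (ordering convention assumed here) */
        return ib->priority - ia->priority;
    }

    static void
    demo(MyItem *item)
    {
        pairingheap *heap = pairingheap_allocate(my_cmp, NULL);

        pairingheap_add(heap, &item->ph_node);
        /* pairingheap_remove() is what this commit fixes: the removed
         * node's sibling links must not be left pointing into the heap */
        pairingheap_remove(heap, &item->ph_node);
    }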
2015-01-19  Use abbreviated keys for faster sorting of text datums.  (Robert Haas)
This commit extends the SortSupport infrastructure to allow operator classes the option to provide abbreviated representations of Datums; in the case of text, we abbreviate by taking the first few characters of the strxfrm() blob. If the abbreviated comparison is insufficient to resolve the comparison, we fall back on the normal comparator. This can be much faster than the old way of doing sorting if the first few bytes of the string are usually sufficient to resolve the comparison.

There is the potential for a performance regression if all of the strings to be sorted are identical for the first 8+ characters and differ only in later positions; therefore, the SortSupport machinery now provides an infrastructure to abort the use of abbreviation if it appears that abbreviation is producing comparatively few distinct keys. HyperLogLog, a streaming cardinality estimator, is included in this commit and used to make that determination for text.

Peter Geoghegan, reviewed by me.
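A hedged sketch of the abbreviation idea for text (names are illustrative; the real code additionally deals with encodings, byte order, and buffer sizing):

    #include <stdint.h>
    #include <string.h>

    /* pack a memcmp-able prefix of the strxfrm() blob into an integer */
    static uint64_t
    abbreviate_text(const char *val)
    {
        char        blob[64];   /* assume large enough for this sketch */
        uint64_t    abbrev = 0;
        size_t      n;

        n = strxfrm(blob, val, sizeof(blob));
        if (n > sizeof(abbrev))
            n = sizeof(abbrev);
        memcpy(&abbrev, blob, n);
        /* if two abbreviated keys compare equal, the sort falls back
         * to the authoritative full comparator */
        return abbrev;
    }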
2015-01-17  Advance backend's advertised xmin more aggressively.  (Heikki Linnakangas)
Currently, a backend will reset its PGXACT->xmin value when it doesn't have any registered snapshots left. That covered the common case that a transaction in read committed mode runs several queries, one after another, as there would be no snapshots active between those queries. However, if you hold cursors across the queries, we didn't get a chance to reset xmin.

To make that better, keep all the registered snapshots in a pairing heap, ordered by xmin so that it's always quick to find the snapshot with the smallest xmin. That allows us to advance PGXACT->xmin whenever the oldest snapshot is deregistered, even if there are others still active.

Per discussion originally started by Jeff Davis back in 2009 and more recently by Robert Haas.
2015-01-06  Update copyright for 2015  (Bruce Momjian)
Backpatch certain files through 9.0
2014-12-22  Move rbtree.c from src/backend/utils/misc to src/backend/lib.  (Heikki Linnakangas)
We have other general-purpose data structures in src/backend/lib, so it seems like a better home for the red-black tree as well.
2014-12-22  Use a pairing heap for the priority queue in kNN-GiST searches.  (Heikki Linnakangas)
This performs slightly better, uses less memory, and needs slightly less code in GiST than the red-black tree previously used.

Reviewed by Peter Geoghegan
2014-05-06  pgindent run for 9.4  (Bruce Momjian)
This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not affect backpatching.
2014-01-07  Update copyright for 2014  (Bruce Momjian)
Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
2013-10-24  Use improved vsnprintf calling logic in more places.  (Tom Lane)
When we are using a C99-compliant vsnprintf implementation (which should be most places, these days) it is worth the trouble to make use of its report of how large the buffer needs to be to succeed. This patch adjusts stringinfo.c and some miscellaneous usages in pg_dump to do that, relying on the logic recently added in libpgcommon's psprintf.c. Since these places want to know the number of bytes written once we succeed, modify the API of pvsnprintf() to report that.

There remains near-duplicate logic in pqexpbuffer.c, but since that code is in libpq, psprintf.c's approach of exit()-on-error isn't appropriate for use there. Also note that I didn't bother touching the multitude of places that call (v)snprintf without any attempt to provide a resizable buffer.

Release-note-worthy incompatibility: the API of appendStringInfoVA() changed. If there's any third-party code that's calling that directly, it will need tweaking along the same lines as in this patch.

David Rowley and Tom Lane
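Callers of the changed API follow a try/enlarge/retry pattern: appendStringInfoVA() now reports how much space it needed when it fails, so the caller can grow by exactly that amount. A sketch of the loop, modeled on appendStringInfo() itself:

    #include "lib/stringinfo.h"

    static void
    append_fmt(StringInfo str, const char *fmt, ...)
    {
        for (;;)
        {
            va_list     args;
            int         needed;

            va_start(args, fmt);
            needed = appendStringInfoVA(str, fmt, args);
            va_end(args);

            if (needed == 0)
                break;          /* it fit; done */

            /* grow by exactly what vsnprintf said it needs, then retry */
            enlargeStringInfo(str, needed);
        }
    }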
2013-08-30  Reset the binary heap in MergeAppend rescans.  (Tom Lane)
Failing to do so can cause queries to return wrong data, error out or crash. This requires adding a new binaryheap_reset() method to binaryheap.c, but that probably should have been there anyway. Per bug #8410 from Terje Elde. Diagnosis and patch by Andres Freund.
2013-07-24  Improve ilist.h's support for deletion of slist elements during iteration.  (Tom Lane)
Previously one had to use slist_delete(), implying an additional scan of the list, making this infrastructure considerably less efficient than traditional Lists when deletion of element(s) in a long list is needed. Modify the slist_foreach_modify() macro to support deleting the current element in O(1) time, by keeping a "prev" pointer in addition to "cur" and "next". Although this makes iteration with this macro a bit slower, no real harm is done, since in any scenario where you're not going to delete the current list element you might as well just use slist_foreach instead. Improve the comments about when to use each macro.

Back-patch to 9.3 so that we'll have consistent semantics in all branches that provide ilist.h. Note this is an ABI break for callers of slist_foreach_modify().

Andres Freund and Tom Lane
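Usage after this change, as a sketch (MyNode is illustrative; the slist_* names follow ilist.h, with slist_delete_current() providing the O(1) unlink this commit enables):

    #include <stdbool.h>

    #include "lib/ilist.h"

    typedef struct MyNode
    {
        bool        dead;
        slist_node  link;       /* embedded singly-linked list node */
    } MyNode;

    static void
    prune(slist_head *list)
    {
        slist_mutable_iter iter;

        slist_foreach_modify(iter, list)
        {
            MyNode *node = slist_container(MyNode, link, iter.cur);

            if (node->dead)
                slist_delete_current(&iter);    /* O(1), via iter's prev */
        }
    }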
2013-05-29  pgindent run for release 9.3  (Bruce Momjian)
This is the first run of the Perl-based pgindent script. Also update pgindent instructions.
2013-01-01  Update copyrights for 2013  (Bruce Momjian)
Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
2012-11-29  Basic binary heap implementation.  (Robert Haas)
There are probably other places where this can be used, but for now, this just makes MergeAppend use it, so that this code will have test coverage. There is other work in the queue that will use this, as well. Abhijit Menon-Sen, reviewed by Andres Freund, Robert Haas, Álvaro Herrera, Tom Lane, and others.
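A sketch of the calling pattern (illustrative; the binaryheap_* names follow lib/binaryheap.h, and the comparator ordering convention is assumed): bulk-load unordered, heapify once, then consume.

    #include "postgres.h"

    #include "lib/binaryheap.h"

    /* smallest value surfaces first (ordering convention assumed) */
    static int
    cmp_int32(Datum a, Datum b, void *arg)
    {
        return DatumGetInt32(b) - DatumGetInt32(a);
    }

    static void
    demo(const int32 *values, int nvalues)
    {
        binaryheap *heap = binaryheap_allocate(nvalues, cmp_int32, NULL);

        for (int i = 0; i < nvalues; i++)
            binaryheap_add_unordered(heap, Int32GetDatum(values[i]));
        binaryheap_build(heap);     /* one O(n) heapify pass */

        while (!binaryheap_empty(heap))
            (void) binaryheap_remove_first(heap);   /* consume in order */
    }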
2012-11-27  Add explicit casts in ilist.h's inline functions.  (Tom Lane)
Needed to silence C++ errors, per report from Peter Eisentraut. Andres Freund
2012-10-18  Remove unnecessary "head" arguments from some dlist/slist functions.  (Tom Lane)
dlist_delete, dlist_insert_after, dlist_insert_before, slist_insert_after do not need access to the list header, and indeed insisting on that negates one of the main advantages of a doubly-linked list. In consequence, revert addition of "cache_bucket" field to CatCTup.
2012-10-18  Code review for inline-list patch.  (Tom Lane)
Make foreach macros less syntactically dangerous, and fix some typos in evidently-never-tested ones. Add missing slist_next_node and slist_head_node functions. Fix broken dlist_check code. Assorted comment improvements.
2012-10-17  Embedded list interface  (Alvaro Herrera)
Provide a common implementation of embedded singly-linked and doubly-linked lists. "Embedded" in the sense that the nodes' next/previous pointers exist within some larger struct; this design choice reduces memory allocation overhead.

Most of the implementation uses inlineable functions (where supported), for performance. Some existing uses of both types of lists have been converted to the new code, for demonstration purposes. Other uses can (and probably will) be converted in the future. Since dllist.c is unused after this conversion, it has been removed.

Author: Andres Freund
Some tweaks by me
Reviewed by Tom Lane, Peter Geoghegan
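The embedding pattern in code (a sketch; WaitingProc is an illustrative struct, while the dlist_* names follow lib/ilist.h):

    #include "lib/ilist.h"

    typedef struct WaitingProc
    {
        int         pid;
        dlist_node  links;      /* the list node lives inside the struct */
    } WaitingProc;

    static dlist_head waiters = DLIST_STATIC_INIT(waiters);

    static void
    enqueue(WaitingProc *proc)
    {
        /* no allocation: linking in just sets the embedded pointers */
        dlist_push_tail(&waiters, &proc->links);
    }

    static void
    scan(void)
    {
        dlist_iter  iter;

        dlist_foreach(iter, &waiters)
        {
            WaitingProc *proc = dlist_container(WaitingProc, links, iter.cur);

            (void) proc;        /* ... use proc->pid ... */
        }
    }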
2012-06-10  Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest.  (Bruce Momjian)
2012-01-01  Update copyright notices for year 2012.  (Bruce Momjian)
2011-09-10  Add missing format attributes  (Peter Eisentraut)
Add __attribute__ decorations for printf format checking to the places that were missing them. Fix the resulting warnings. Add -Wmissing-format-attribute to the standard set of warnings for GCC, so these don't happen again. The warning fixes here are relatively harmless. The one serious problem discovered by this was already committed earlier in cf15fb5cabfbc71e07be23cfbc813daee6c5014f.
2011-04-28  Use a macro variable PG_PRINTF_ATTRIBUTE for the style used for checking printf type functions  (Andrew Dunstan)
The style is set to "printf" for backwards compatibility everywhere except on Windows, where it is set to "gnu_printf", which eliminates hundreds of false error messages from modern versions of gcc arising from %m and %ll{d,u} formats.
2011-01-01  Stamp copyrights for year 2011.  (Bruce Momjian)
2010-09-20  Remove cvs keywords from all files.  (Magnus Hagander)
2010-01-02  Update copyright for the year 2010.  (Bruce Momjian)
2009-07-24  Assorted minor refactoring in EXPLAIN.  (Tom Lane)
This is believed to not change the output at all, with one known exception: "Subquery Scan foo" becomes "Subquery Scan on foo". (We can fix that if anyone complains, but it would be a wart, because the old code was clearly inconsistent.) The main intention is to remove duplicate coding and provide a cleaner base for subsequent EXPLAIN patching. Robert Haas
2009-01-01  Update copyright for 2009.  (Bruce Momjian)
2008-01-01  Update copyrights in source tree to 2008.  (Bruce Momjian)
2007-03-03  Add resetStringInfo(), which clears the content of a StringInfo  (Neil Conway)
Also fix up various places in the tree that were clearing a StringInfo by hand. Making this function a part of the API simplifies client code slightly, and avoids needlessly peeking inside the StringInfo interface.
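The before/after, roughly (emit_rows is a hypothetical caller, shown only to illustrate the reuse pattern):

    #include "lib/stringinfo.h"

    static void
    emit_rows(int nrows)
    {
        StringInfoData buf;

        initStringInfo(&buf);
        for (int i = 0; i < nrows; i++)
        {
            /* formerly done by hand: buf.len = 0; buf.data[0] = '\0'; */
            resetStringInfo(&buf);  /* truncate to empty, keep allocation */
            appendStringInfo(&buf, "row %d", i);
            /* ... use buf.data ... */
        }
        pfree(buf.data);
    }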
2007-01-05  Update CVS HEAD for 2007 copyright  (Bruce Momjian)
Back branches are typically not back-stamped for this.
2006-03-05  Update copyright for 2006. Update scripts.  (Bruce Momjian)
2005-10-15  Standard pgindent run for 8.1.  (Bruce Momjian)
2004-12-31  Tag appropriate files for rc3  (PostgreSQL Daemon)
Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...
2004-08-29  Update copyright to 2004.  (Bruce Momjian)
2004-05-14  Remove an unused (and empty) header file.  (Neil Conway)
2003-11-29  make sure the $Id tags are converted to $PostgreSQL as well ...  (PostgreSQL Daemon)
2003-08-04  Update copyrights to 2003.  (Bruce Momjian)
2003-08-04  pgindent run.  (Bruce Momjian)
2003-04-24  Infrastructure for upgraded error reporting mechanism  (Tom Lane)
elog.c is rewritten and the protocol is changed, but most elog calls are still elog calls. Also, we need to contemplate mechanisms for controlling all this functionality --- eg, how much stuff should appear in the postmaster log? And what API should libpq expose for it?
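A sketch of the new-style call this infrastructure enables, next to the old style that most call sites kept at this point (the exact spelling here follows later PostgreSQL usage and is shown for illustration):

    /* structured report: severity, SQLSTATE, and message travel separately */
    ereport(ERROR,
            (errcode(ERRCODE_DIVISION_BY_ZERO),
             errmsg("division by zero")));

    /* the old style remains valid, and most calls still look like this */
    elog(ERROR, "division by zero");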
2003-04-19  Second round of FE/BE protocol changes  (Tom Lane)
Frontend->backend messages now have length counts, and COPY IN data is packetized into messages.
2002-06-20  Update copyright to 2002.  (Bruce Momjian)
2001-11-05  New pgindent run with fixes suggested by Tom.  (Bruce Momjian)
Patch manually reviewed, initdb/regression tests pass.
2001-10-28  Another pgindent run.  (Bruce Momjian)
Fixes enum indenting, and improves #endif spacing. Also adds space for one-line comments.