summaryrefslogtreecommitdiff
path: root/src/backend/utils/adt
AgeCommit message (Collapse)Author
2014-03-17Fix typos in comments.Fujii Masao
Thom Brown
2014-03-13C comments: remove odd blank lines after #ifdef WIN32 linesBruce Momjian
2014-03-12Items on GIN data pages are no longer always 6 bytes; update gincostestimate.Heikki Linnakangas
Also improve the comments a bit.
2014-03-12Allow opclasses to provide tri-valued GIN consistent functions.Heikki Linnakangas
With the GIN "fast scan" feature, GIN can skip items without fetching all the keys for them, if it can prove that they don't match regardless of those keys. So far, it has done the proving by calling the boolean consistent function with all combinations of TRUE/FALSE for the unfetched keys, but since that's O(n^2), it becomes unfeasible with more than a few keys. We can avoid calling consistent with all the combinations, if we can tell the operator class implementation directly which keys are unknown. This commit includes a triConsistent function for the built-in array and tsvector opclasses. Alexander Korotkov, with some changes by me.
2014-03-07Avoid memcpy() with same source and destination address.Heikki Linnakangas
The behavior of that is undefined, although unlikely to lead to problems in practice. Found by running regression tests with Valgrind.
2014-03-06Avoid getting more than AccessShareLock when deparsing a query.Tom Lane
In make_ruledef and get_query_def, we have long used AcquireRewriteLocks to ensure that the querytree we are about to deparse is up-to-date and the schemas of the underlying relations aren't changing. Howwever, that function thinks the query is about to be executed, so it acquires locks that are stronger than necessary for the purpose of deparsing. Thus for example, if pg_dump asks to deparse a rule that includes "INSERT INTO t", we'd acquire RowExclusiveLock on t. That results in interference with concurrent transactions that might for example ask for ShareLock on t. Since pg_dump is documented as being purely read-only, this is unexpected. (Worse, it used to actually be read-only; this behavior dates back only to 8.1, cf commit ba4200246.) Fix this by adding a parameter to AcquireRewriteLocks to tell it whether we want the "real" execution locks or only AccessShareLock. Report, diagnosis, and patch by Dean Rasheed. Back-patch to all supported branches.
2014-03-06isdigit() needs an unsigned char argument.Heikki Linnakangas
Per the C standard, the routine should be passed an int, with a value that's representable as an unsigned char or EOF. Passing a signed char is wrong, because a negative value is not representable as an unsigned char. Unfortunately no compiler warns about that.
2014-03-05Fix portability issues in recently added make_timestamp/make_interval code.Tom Lane
Explicitly reject infinity/NaN inputs, rather than just assuming that something else will do it for us. Per buildfarm. While at it, make some over-parenthesized and under-legible code more readable.
2014-03-04Constructors for interval, timestamp, timestamptzAlvaro Herrera
Author: Pavel Stěhule, editorialized somewhat by Álvaro Herrera Reviewed-by: Tomáš Vondra, Marko Tiikkaja With input from Fabrízio de Royes Mello, Jim Nasby
2014-03-03Another round of Coverity fixesStephen Frost
Additional non-security issues/improvements spotted by Coverity. In backend/libpq, no sense trying to protect against port->hba being NULL after we've already dereferenced it in the switch() statement. Prevent against possible overflow due to 32bit arithmitic in basebackup throttling (not yet released, so no security concern). Remove nonsensical check of array pointer against NULL in procarray.c, looks to be a holdover from 9.1 and earlier when there were pointers being used but now it's just an array. Remove pointer check-against-NULL in tsearch/spell.c as we had already dereferenced it above (in the strcmp()). Remove dead code from adt/orderedsetaggs.c, isnull is checked immediately after each tuplesort_getdatum() call and if true we return, so no point checking it again down at the bottom. Remove recently added minor error-condition memory leak in pg_regress.
2014-03-01Allow regex operations to be terminated early by query cancel requests.Tom Lane
The regex code didn't have any provision for query cancel; which is unsurprising given its non-Postgres origin, but still problematic since some operations can take a long time. Introduce a callback function to check for a pending query cancel or session termination request, and call it in a couple of strategic spots where we can make the regex code exit with an error indicator. If we ever actually split out the regex code as a standalone library, some additional work will be needed to let the cancel callback function be specified externally to the library. But that's straightforward (certainly so by comparison to putting the locale-dependent character classification logic on a similar arms-length basis), and there seems no need to do it right now. A bigger issue is that there may be more places than these two where we need to check for cancels. We can always add more checks later, now that the infrastructure is in place. Since there are known examples of not-terribly-long regexes that can lock up a backend for a long time, back-patch to all supported branches. I have hopes of fixing the known performance problems later, but adding query cancel ability seems like a good idea even if they were all fixed.
2014-02-26Fix crash in json_to_record().Jeff Davis
json_to_record() depends on get_call_result_type() for the tuple descriptor of the record that should be returned, but in some cases that cannot be determined. Add a guard to check if the tuple descriptor has been properly resolved, similar to other callers of get_call_result_type(). Also add guard for two other callers of get_call_result_type() in jsonfuncs.c. Although json_to_record() is the only actual bug, it's a good idea to follow convention.
2014-02-25Use SnapshotDirty rather than an active snapshot to probe index endpoints.Tom Lane
If there are lots of uncommitted tuples at the end of the index range, get_actual_variable_range() ends up fetching each one and doing an MVCC visibility check on it, until it finally hits a visible tuple. This is bad enough in isolation, considering that we don't need an exact answer only an approximate one. But because the tuples are not yet committed, each visibility check does a TransactionIdIsInProgress() test, which involves scanning the ProcArray. When multiple sessions do this concurrently, the ensuing contention results in horrid performance loss. 20X overall throughput loss on not-too-complicated queries is easy to demonstrate in the back branches (though someone's made it noticeably less bad in HEAD). We can dodge the problem fairly effectively by using SnapshotDirty rather than a normal MVCC snapshot. This will cause the index probe to take uncommitted tuples as good, so that we incur only one tuple fetch and test even if there are many such tuples. The extent to which this degrades the estimate is debatable: it's possible the result is actually a more accurate prediction than before, if the endmost tuple has become committed by the time we actually execute the query being planned. In any case, it's not very likely that it makes the estimate a lot worse. SnapshotDirty will still reject tuples that are known committed dead, so we won't give bogus answers if an invalid outlier has been deleted but not yet vacuumed from the index. (Because btrees know how to mark such tuples dead in the index, we shouldn't have a big performance problem in the case that there are many of them at the end of the range.) This consideration motivates not using SnapshotAny, which was also considered as a fix. Note: the back branches were using SnapshotNow instead of an MVCC snapshot, but the problem and solution are the same. Per performance complaints from Bartlomiej Romanski, Josh Berkus, and others. Back-patch to 9.0, where the issue was introduced (by commit 40608e7f949fb7e4025c0ddd5be01939adc79eec).
2014-02-25Show xid and xmin in pg_stat_activity and pg_stat_replication.Robert Haas
Christian Kruse, reviewed by Andres Freund and myself, with further minor adjustments by me.
2014-02-24Allow single-point polygons to be converted to circlesBruce Momjian
This allows finding the center of a single-point polygon and converting it to a point. Per report from Josef Grahn
2014-02-24docs: document behavior of CHAR() comparisons with chars < spaceBruce Momjian
Space trimming rather than space-padding causes unusual behavior, which might not be standards-compliant. Also remove recently-added now-redundant C comment.
2014-02-23Prefer pg_any_to_server/pg_server_to_any over pg_do_encoding_conversion.Tom Lane
A large majority of the callers of pg_do_encoding_conversion were specifying the database encoding as either source or target of the conversion, meaning that we can use the less general functions pg_any_to_server/pg_server_to_any instead. The main advantage of using the latter functions is that they can make use of a cached conversion-function lookup in the common case that the other encoding is the current client_encoding. It's notationally cleaner too in most cases, not least because of the historical artifact that the latter functions use "char *" rather than "unsigned char *" in their APIs. Note that pg_any_to_server will apply an encoding verification step in some cases where pg_do_encoding_conversion would have just done nothing. This seems to me to be a good idea at most of these call sites, though it partially negates the performance benefit. Per discussion of bug #9210.
2014-02-21Do ScalarArrayOp estimation correctly when array is a stable expression.Tom Lane
Most estimation functions apply estimate_expression_value to see if they can reduce an expression to a constant; the key difference is that it allows evaluation of stable as well as immutable functions in hopes of ending up with a simple Const node. scalararraysel didn't get the memo though, and neither did gincost_opexpr/gincost_scalararrayopexpr. Fix that, and remove a now-unnecessary estimate_expression_value step in the subsidiary function scalararraysel_containment. Per complaint from Alexey Klyukin. Back-patch to 9.3. The problem goes back further, but I'm hesitant to change estimation behavior in long-stable release branches.
2014-02-19Further code review for pg_lsn data type.Robert Haas
Change input function error messages to be more consistent with what is done elsewhere. Remove a bunch of redundant type casts, so that the compiler will warn us if we screw up. Don't pass LSNs by value on platforms where a Datum is only 32 bytes, per buildfarm. Move macros for packing and unpacking LSNs to pg_lsn.h so that we can include access/xlogdefs.h, to avoid an unsatisfied dependency on XLogRecPtr.
2014-02-19pg_lsn macro naming and type behavior revisions.Robert Haas
Change pg_lsn_mi so that it can return negative values when subtracting LSNs, and clean up some perhaps ill-considered macro names.
2014-02-19Add a pg_lsn data type, to represent an LSN.Robert Haas
Robert Haas and Michael Paquier
2014-02-17Prevent potential overruns of fixed-size buffers.Tom Lane
Coverity identified a number of places in which it couldn't prove that a string being copied into a fixed-size buffer would fit. We believe that most, perhaps all of these are in fact safe, or are copying data that is coming from a trusted source so that any overrun is not really a security issue. Nonetheless it seems prudent to forestall any risk by using strlcpy() and similar functions. Fixes by Peter Eisentraut and Jozef Mlich based on Coverity reports. In addition, fix a potential null-pointer-dereference crash in contrib/chkpass. The crypt(3) function is defined to return NULL on failure, but chkpass.c didn't check for that before using the result. The main practical case in which this could be an issue is if libc is configured to refuse to execute unapproved hashing algorithms (e.g., "FIPS mode"). This ideally should've been a separate commit, but since it touches code adjacent to one of the buffer overrun changes, I included it in this commit to avoid last-minute merge issues. This issue was reported by Honza Horak. Security: CVE-2014-0065 for buffer overruns, CVE-2014-0066 for crypt()
2014-02-17Predict integer overflow to avoid buffer overruns.Noah Misch
Several functions, mostly type input functions, calculated an allocation size such that the calculation wrapped to a small positive value when arguments implied a sufficiently-large requirement. Writes past the end of the inadvertent small allocation followed shortly thereafter. Coverity identified the path_in() vulnerability; code inspection led to the rest. In passing, add check_stack_depth() to prevent stack overflow in related functions. Back-patch to 8.4 (all supported versions). The non-comment hstore changes touch code that did not exist in 8.4, so that part stops at 9.0. Noah Misch and Heikki Linnakangas, reviewed by Tom Lane. Security: CVE-2014-0064
2014-02-17Shore up ADMIN OPTION restrictions.Noah Misch
Granting a role without ADMIN OPTION is supposed to prevent the grantee from adding or removing members from the granted role. Issuing SET ROLE before the GRANT bypassed that, because the role itself had an implicit right to add or remove members. Plug that hole by recognizing that implicit right only when the session user matches the current role. Additionally, do not recognize it during a security-restricted operation or during execution of a SECURITY DEFINER function. The restriction on SECURITY DEFINER is not security-critical. However, it seems best for a user testing his own SECURITY DEFINER function to see the same behavior others will see. Back-patch to 8.4 (all supported versions). The SQL standards do not conflate roles and users as PostgreSQL does; only SQL roles have members, and only SQL users initiate sessions. An application using PostgreSQL users and roles as SQL users and roles will never attempt to grant membership in the role that is the session user, so the implicit right to add or remove members will never arise. The security impact was mostly that a role member could revoke access from others, contrary to the wishes of his own grantor. Unapproved role member additions are less notable, because the member can still largely achieve that by creating a view or a SECURITY DEFINER function. Reviewed by Andres Freund and Tom Lane. Reported, independently, by Jonas Sundman and Noah Misch. Security: CVE-2014-0060
2014-02-13Add C comment about problems with CHAR() space trimmingBruce Momjian
2014-02-06Alphabeticize list in OBJS definition in utils/adt Makefile.Andrew Dunstan
2014-02-05Fix whitespacePeter Eisentraut
2014-02-04Fix comparison of an array of characters with zero to compare with '\0' instead.Fujii Masao
Report from Andres Freund.
2014-02-03In json code, clean up temp memory contexts after processing.Andrew Dunstan
Craig Ringer.
2014-02-01arrays: tighten checks for multi-dimensional inputBruce Momjian
Previously an input array string that started with a single-element array dimension would then later accept a multi-dimensional segment. BACKWARD INCOMPATIBILITY
2014-01-30Add checks for interval overflow/underflowBruce Momjian
New checks include input, month/day/time internal adjustments, addition, subtraction, multiplication, and negation. Also adjust docs to correctly specify interval size in bytes. Report from Rok Kralj
2014-01-29Silence compiler warnings about possibly unset variables.Andrew Dunstan
They are in fact set in every case where they are needed, but the compiler doesn't know that. Per gripe from Tom Lane.
2014-01-29Add json_array_elements_text function.Andrew Dunstan
This was a notable omission from the json functions added in 9.3 and there have been numerous complaints about its absence. Laurence Rowe.
2014-01-28New json functions.Andrew Dunstan
json_build_array() and json_build_object allow for the construction of arbitrarily complex json trees. json_object() turns a one or two dimensional array, or two separate arrays, into a json_object of name/value pairs, similarly to the hstore() function. json_object_agg() aggregates its two arguments into a single json object as name value pairs. Catalog version bumped. Andrew Dunstan, reviewed by Marko Tiikkaja.
2014-01-29Add pg_stat_archiver statistics view.Fujii Masao
This view shows the statistics about the WAL archiver process's activity. Gabriele Bartolini, reviewed by Michael Paquier, refactored a bit by me.
2014-01-26Enable building with Visual Studion 2013.Andrew Dunstan
Backpatch to 9.3. Brar Piening.
2014-01-23Make DROP IF EXISTS more consistently not failAlvaro Herrera
Some cases were still reporting errors and aborting, instead of a NOTICE that the object was being skipped. This makes it more difficult to cleanly handle pg_dump --clean, so change that to instead skip missing objects properly. Per bug #7873 reported by Dave Rolsky; apparently this affects a large number of users. Authors: Pavel Stehule and Dean Rasheed. Some tweaks by Álvaro Herrera
2014-01-22Reindent json.c and jsonfuncs.c.Andrew Dunstan
This will help in preparation of clean patches for upcoming json work.
2014-01-21Add a cardinality function for arrays.Robert Haas
Unlike our other array functions, this considers the total number of elements across all dimensions, and returns 0 rather than NULL when the array has no elements. But it seems that both of those behaviors are almost universally disliked, so hopefully that's OK. Marko Tiikkaja, reviewed by Dean Rasheed and Pavel Stehule
2014-01-20Fix to_timestamp/to_date's handling of consecutive spaces in format string.Tom Lane
When there are consecutive spaces (or other non-format-code characters) in the format, we should advance over exactly that many characters of input. The previous coding mistakenly did a "skip whitespace" action between such characters, possibly allowing more input to be skipped than the user intended. We only need to skip whitespace just before an actual field. This is really a bug fix, but given the minimal number of field complaints and the risk of breaking applications coded to expect the old behavior, let's not back-patch it. Jeevan Chalke
2014-01-18Make various variables const (read-only).Tom Lane
These changes should generally improve correctness/maintainability. A nice side benefit is that several kilobytes move from initialized data to text segment, allowing them to be shared across processes and probably reducing copy-on-write overhead while forking a new backend. Unfortunately this doesn't seem to help libpq in the same way (at least not when it's compiled with -fpic on x86_64), but we can hope the linker at least collects all nominally-const data together even if it's not actually part of the text segment. Also, make pg_encname_tbl[] static in encnames.c, since there seems no very good reason for any other code to use it; per a suggestion from Wim Lewis, who independently submitted a patch that was mostly a subset of this one. Oskari Saarenmaa, with some editorialization by me
2014-01-09Remove unnecessary local variables to work around an icc optimization bug.Tom Lane
Buildfarm member dunlin has been crashing since commit 8b49a60, but other machines seem fine with that code. It turns out that removing the local variables in ordered_set_startup() that are copies of fields in "qstate" dodges the problem. This might cost a few cycles on register-rich machines, but it's probably a wash on others, and in any case this code isn't performance-critical. Thanks to Jeremy Drake for off-list investigation.
2014-01-08Avoid extra AggCheckCallContext() checks in ordered-set aggregates.Tom Lane
In the transition functions, we don't really need to recheck this after the first call. I had been feeling paranoid about possibly getting a non-null argument value in some other context; but it's probably game over anyway if we have a non-null "internal" value that's not what we are expecting. In the final functions, the general convention in pre-existing final functions seems to be that an Assert() is good enough, so do it like that here too. This seems to save a few tenths of a percent of overall query runtime, which isn't much, but still it's just overhead if there's not a plausible case where the checks would fire.
2014-01-07Update copyright for 2014Bruce Momjian
Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
2014-01-06Add more use of psprintf()Peter Eisentraut
2014-01-05Cache catalog lookup data across groups in ordered-set aggregates.Tom Lane
The initial commit of ordered-set aggregates just did all the setup work afresh each time the aggregate function is started up. But in a GROUP BY query, the catalog lookups need not be repeated for each group, since the column datatypes and sort information won't change. When there are many small groups, this makes for a useful, though not huge, performance improvement. Per suggestion from Andrew Gierth. Profiling of these cases suggests that it might be profitable to avoid duplicate lookups within tuplesort startup as well; but changing the tuplesort APIs would have much broader impact, so I left that for another day.
2014-01-04Fix header comment for bitncmp().Tom Lane
The result is an int less than, equal to, or greater than zero, in the style of memcmp (and, in fact, exactly the output of memcmp in some cases). This comment previously said -1, 1, or 0, which was an overspecification, as noted by Emre Hasegeli. All of the existing callers appear to be fine with the actual behavior, so just fix the comment. In passing, improve infelicitous formatting of some call sites.
2013-12-27Properly detect invalid JSON numbers when generating JSON.Andrew Dunstan
Instead of looking for characters that aren't valid in JSON numbers, we simply pass the output string through the JSON number parser, and if it fails the string is quoted. This means among other things that money and domains over money will be quoted correctly and generate valid JSON. Fixes bug #8676 reported by Anderson Cristian da Silva. Backpatched to 9.2 where JSON generation was introduced.
2013-12-27Fix misplaced right paren bugs in pgstatfuncs.c.Kevin Grittner
The bug would only show up if the C sockaddr structure contained zero in the first byte for a valid address; otherwise it would fail to fail, which is probably why it went unnoticed for so long. Patch submitted by Joel Jacobson after seeing an article by Andrey Karpov in which he reports finding this through static code analysis using PVS-Studio. While I was at it I moved a definition of a local variable referenced in the buggy code to a more local context. Backpatch to all supported branches.
2013-12-23Fix ANALYZE failure on a column that's a domain over a range.Tom Lane
Most other range operations seem to work all right on domains, but this one not so much, at least not since commit 918eee0c. Per bug #8684 from Brett Neumeier.