summaryrefslogtreecommitdiff
path: root/src/backend/utils/adt
AgeCommit message (Collapse)Author
2011-11-08Make DatumGetInetP() unpack inet datums with a 1-byte header, and addHeikki Linnakangas
a new macro, DatumGetInetPP(), that does not. This brings these macros in line with other DatumGet*P() macros. Backpatch to 8.3, where 1-byte header varlenas were introduced.
2011-10-29Fix assorted bogosities in cash_in() and cash_out().Tom Lane
cash_out failed to handle multiple-byte thousands separators, as per bug #6277 from Alexander Law. In addition, cash_in didn't handle that either, nor could it handle multiple-byte positive_sign. Both routines failed to support multiple-byte mon_decimal_point, which I did not think was worth changing, but at least now they check for the possibility and fall back to using '.' rather than emitting invalid output. Also, make cash_in handle trailing negative signs, which formerly it would reject. Since cash_out generates trailing negative signs whenever the locale tells it to, this last omission represents a fail-to-reload-dumped-data bug. IMO that justifies patching this all the way back.
2011-09-07Fix corner case bug in numeric to_char().Tom Lane
Trailing-zero stripping applied by the FM specifier could strip zeroes to the left of the decimal point, for a format with no digit positions after the decimal point (such as "FM999."). Reported and diagnosed by Marti Raudsepp, though I didn't use his patch.
2011-09-01Further repair of eqjoinsel ndistinct-clamping logic.Tom Lane
Examination of examples provided by Mark Kirkwood and others has convinced me that actually commit 7f3eba30c9d622d1981b1368f2d79ba0999cdff2 was quite a few bricks shy of a load. The useful part of that patch was clamping ndistinct for the inner side of a semi or anti join, and the reason why that's needed is that it's the only way that restriction clauses eliminating rows from the inner relation can affect the estimated size of the join result. I had not clearly understood why the clamping was appropriate, and so mis-extrapolated to conclude that we should clamp ndistinct for the outer side too, as well as for both sides of regular joins. These latter actions were all wrong, and are reverted with this patch. In addition, the clamping logic is now made to affect the behavior of both paths in eqjoinsel_semi, with or without MCV lists to compare. When we have MCVs, we suppose that the most common values are the ones that are most likely to survive the decimation resulting from a lower restriction clause, so we think of the clamping as eliminating non-MCV values, or potentially even the least-common MCVs for the inner relation. Back-patch to 8.4, same as previous fixes in this area.
2011-08-31Improve eqjoinsel's ndistinct clamping to work for multiple levels of join.Tom Lane
This patch fixes an oversight in my commit 7f3eba30c9d622d1981b1368f2d79ba0999cdff2 of 2008-10-23. That patch accounted for baserel restriction clauses that reduced the number of rows coming out of a table (and hence the number of possibly-distinct values of a join variable), but not for join restriction clauses that might have been applied at a lower level of join. To account for the latter, look up the sizes of the min_lefthand and min_righthand inputs of the current join, and clamp with those in the same way as for the base relations. Noted while investigating a complaint from Ben Chobot, although this in itself doesn't seem to explain his report. Back-patch to 8.4; previous versions used different estimation methods for which this heuristic isn't relevant.
2011-08-26Fix potential memory clobber in tsvector_concat().Tom Lane
tsvector_concat() allocated its result workspace using the "conservative" estimate of the sum of the two input tsvectors' sizes. Unfortunately that wasn't so conservative as all that, because it supposed that the number of pad bytes required could not grow. Which it can, as per test case from Jesper Krogh, if there's a mix of lexemes with positions and lexemes without them in the input data. The fix is to assume that we might add a not-previously-present pad byte for each and every lexeme in the two inputs; which really is conservative, but it doesn't seem worthwhile to try to be more precise. This is an aboriginal bug in tsvector_concat, so back-patch to all versions containing it.
2011-06-17Add overflow checks to int4 and int8 versions of generate_series().Robert Haas
The previous code went into an infinite loop after overflow. In fact, an overflow is not really an error; it just means that the current value is the last one we need to return. So, just arrange to stop immediately when overflow is detected. Back-patch all the way.
2011-05-28Fix null-dereference crash in parse_xml_decl().Tom Lane
parse_xml_decl's header comment says you can pass NULL for any unwanted output parameter, but it failed to honor this contract for the "standalone" flag. The only currently-affected caller is xml_recv, so the net effect is that sending a binary XML value containing a standalone parameter in its xml declaration would crash the backend. Per bug #6044 from Christopher Dillard. In passing, remove useless initializations of parse_xml_decl's output parameters in xml_parse. Back-patch to 8.3, where this code was introduced.
2011-05-26Make decompilation of optimized CASE constructs more robust.Tom Lane
We had some hacks in ruleutils.c to cope with various odd transformations that the optimizer could do on a CASE foo WHEN "CaseTestExpr = RHS" clause. However, the fundamental impossibility of covering all cases was exposed by Heikki, who pointed out that the "=" operator could get replaced by an inlined SQL function, which could contain nearly anything at all. So give up on the hacks and just print the expression as-is if we fail to recognize it as "CaseTestExpr = RHS". (We must cover that case so that decompiled rules print correctly; but we are not under any obligation to make EXPLAIN output be 100% valid SQL in all cases, and already could not do so in some other cases.) This approach requires that we have some printable representation of the CaseTestExpr node type; I used "CASE_TEST_EXPR". Back-patch to all supported branches, since the problem case fails in all.
2011-05-24Avoid uninitialized bits in the result of QTN2QT().Tom Lane
Found with additional valgrind testing. Noah Misch
2011-04-29Rewrite pg_size_pretty() to avoid compiler bug.Tom Lane
Convert it to use successive shifts right instead of increasing a divisor. This is probably a tad more efficient than the original coding, and it's nicer-looking than the previous patch because we don't need a special case to avoid overflow in the last branch. But the real reason to do it is to avoid a Solaris compiler bug, as per results from buildfarm member moa.
2011-04-27Fix array- and path-creating functions to ensure padding bytes are zeroes.Tom Lane
Per recent discussion, it's important for all computed datums (not only the results of input functions) to not contain any ill-defined (uninitialized) bits. Failing to ensure that can result in equal() reporting that semantically indistinguishable Consts are not equal, which in turn leads to bizarre and undesirable planner behavior, such as in a recent example from David Johnston. We might eventually try to fix this in a general manner by allowing datatypes to define identity-testing functions, but for now the path of least resistance is to expect datatypes to force all unused bits into consistent states. Per some testing by Noah Misch, array and path functions seem to be the only ones presenting risks at the moment, so I looked through all the functions in adt/array*.c and geo_ops.c and fixed them as necessary. In the array functions, the easiest/safest fix is to allocate result arrays with palloc0 instead of palloc. Possibly in future someone will want to look into whether we can just zero the padding bytes, but that looks too complex for a back-patchable fix. In the path functions, we already had a precedent in path_in for just zeroing the one known pad field, so duplicate that code as needed. Back-patch to all supported branches.
2011-04-25Fix pg_size_pretty() to avoid overflow for inputs close to INT64_MAX.Tom Lane
The expression that tried to round the value to the nearest TB could overflow, leading to bogus output as reported in bug #5993 from Nicola Cossu. This isn't likely to ever happen in the intended usage of the function (if it could, we'd be needing to use a wider datatype instead); but it's not hard to give the expected output, so let's do so.
2011-04-12Be more wary of missing statistics in eqjoinsel_semi().Tom Lane
In particular, if we don't have real ndistinct estimates for both sides, fall back to assuming that half of the left-hand rows have join partners. This is what was done in 8.2 and 8.3 (cf nulltestsel() in those versions). It's pretty stupid but it won't lead us to think that an antijoin produces no rows out, as seen in recent example from Uwe Schroeder.
2011-03-11On further reflection, we'd better do the same in int.c.Tom Lane
We previously heard of the same problem in int24div(), so there's not a good reason to suppose the problem is confined to cases involving int8.
2011-03-11Put in some more safeguards against executing a division-by-zero.Tom Lane
Add dummy returns before every potential division-by-zero in int8.c, because apparently further "improvements" in gcc's optimizer have enabled it to break functions that weren't broken before. Aurelien Jarno, via Martin Pitt
2011-02-01Fix wrong error reports in 'number of array dimensions exceeds theItagaki Takahiro
maximum allowed' messages, that have reported one-less dimensions. Alexey Klyukin
2011-01-17Fix miscalculation of itemsafter in array_set_slice().Tom Lane
If the slice to be assigned to was before the existing array lower bound (requiring at least one null element to spring into existence to fill the gap), the code miscalculated how many entries needed to be copied from the old array's null bitmap. This could result in trashing the array's data area (as seen in bug #5840 from Karsten Loesing), or worse. This has been broken since we first allowed the behavior of assigning to non-adjacent slices, in 8.2. Back-patch to all affected versions.
2010-12-28Avoid unexpected conversion overflow in planner for distant date values.Tom Lane
The "date" type supports a wider range of dates than int64 timestamps do. However, there is pre-int64-timestamp code in the planner that assumes that all date values can be converted to timestamp with impunity. Fortunately, what we really need out of the conversion is always a double (float8) value; so even when the date is out of timestamp's range it's possible to produce a sane answer. All we need is a code path that doesn't try to force the result into int64. Per trouble report from David Rericha. Back-patch to all supported versions. Although this is surely a corner case, there's not much point in advertising a date range wider than timestamp's if we will choke on such values in unexpected places.
2010-12-19Fix up handling of simple-form CASE with constant test expression.Tom Lane
eval_const_expressions() can replace CaseTestExprs with constants when the surrounding CASE's test expression is a constant. This confuses ruleutils.c's heuristic for deparsing simple-form CASEs, leading to Assert failures or "unexpected CASE WHEN clause" errors. I had put in a hack solution for that years ago (see commit 514ce7a331c5bea8e55b106d624e55732a002295 of 2006-10-01), but bug #5794 from Peter Speck shows that that solution failed to cover all cases. Fortunately, there's a much better way, which came to me upon reflecting that Peter's "CASE TRUE WHEN" seemed pretty redundant: we can "simplify" the simple-form CASE to the general form of CASE, by simply omitting the constant test expression from the rebuilt CASE construct. This is intuitively valid because there is no need for the executor to evaluate the test expression at runtime; it will never be referenced, because any CaseTestExprs that would have referenced it are now replaced by constants. This won't save a whole lot of cycles, since evaluating a Const is pretty cheap, but a cycle saved is a cycle earned. In any case it beats kluging ruleutils.c still further. So this patch improves const-simplification and reverts the previous change in ruleutils.c. Back-patch to all supported branches. The bug exists in 8.1 too, but it's out of warranty.
2010-12-19Fix erroneous parsing of tsquery input "... & !(subexpression) | ..."Tom Lane
After parsing a parenthesized subexpression, we must pop all pending ANDs and NOTs off the stack, just like the case for a simple operand. Per bug #5793. Also fix clones of this routine in contrib/intarray and contrib/ltree, where input of types query_int and ltxtquery had the same problem. Back-patch to all supported versions.
2010-11-10Fix line_construct_pm() for the case of "infinite" (DBL_MAX) slope.Tom Lane
This code was just plain wrong: what you got was not a line through the given point but a line almost indistinguishable from the Y-axis, although not truly vertical. The only caller that tries to use this function with m == DBL_MAX is dist_ps_internal for the case where the lseg is horizontal; it would end up producing the distance from the given point to the place where the lseg's line crosses the Y-axis. That function is used by other operators too, so there are several operators that could compute wrong distances from a line segment to something else. Per bug #5745 from jindiax. Back-patch to all supported branches.
2010-11-02Ensure an index that uses a whole-row Var still depends on its table.Tom Lane
We failed to record any dependency on the underlying table for an index declared like "create index i on t (foo(t.*))". This would create trouble if the table were dropped without previously dropping the index. To fix, simplify some overly-cute code in index_create(), accepting the possibility that sometimes the whole-table dependency will be redundant. Also document this hazard in dependency.c. Per report from Kevin Grittner. In passing, prevent a core dump in pg_get_indexdef() if the index's table can't be found. I came across this while experimenting with Kevin's example. Not sure it's a real issue when the catalogs aren't corrupt, but might as well be cautious. Back-patch to all supported versions.
2010-09-22Re-allow input of Julian dates prior to 0001-01-01 AD.Tom Lane
This was unintentionally broken in 8.4 while tightening up checking of ordinary non-Julian date inputs to forbid references to "year zero". Per bug #5672 from Benjamin Gigot.
2010-08-03Fix core dump in QTNodeCompare when tsquery_cmp() is applied to two emptyTom Lane
tsqueries. CompareTSQ has to have a guard for the case rather than blindly applying QTNodeCompare to random data past the end of the datums. Also, change QTNodeCompare to be a little less trusting: use an actual test rather than just Assert'ing that the input is sane. Problem encountered while investigating another issue (I saw a core dump in autoanalyze on a table containing multiple empty tsquery values). Back-patch to all branches with tsquery support. In HEAD, also fix some bizarre (though not outright wrong) coding in tsq_mcontains().
2010-05-28Rewrite LIKE's %-followed-by-_ optimization so it really works (this timeTom Lane
for sure ;-)). It now also optimizes more cases, such as %_%_. Improve comments too. Per bug #5478. In passing, also rename the TCHAR macro to GETCHAR, because pgindent is messing with the formatting of the former (apparently it now thinks TCHAR is a typedef name). Back-patch to 8.3, where the bug was introduced.
2010-03-03Export xml.c's libxml-error-handling support so that contrib/xml2 can use itTom Lane
too, instead of duplicating the functionality (badly). I renamed xml_init to pg_xml_init, because the former seemed just a bit too generic to be safe as a global symbol. I considered likewise renaming xml_ereport to pg_xml_ereport, but felt that the reference to ereport probably made it sufficiently PG-centric already.
2010-02-18Provide some rather hokey ways for EXPLAIN to print FieldStore and assignmentTom Lane
ArrayRef expressions that are not in the immediate context of an INSERT or UPDATE targetlist. Such cases never arise in stored rules, so ruleutils.c hadn't tried to handle them. However, they do occur in the targetlists of plans derived from such statements, and now that EXPLAIN VERBOSE tries to print targetlists, we need some way to deal with the case. I chose to represent an assignment ArrayRef as "array[subscripts] := source", which is fairly reasonable and doesn't omit any information. However, FieldStore is problematic because the planner will fold multiple assignments to fields of the same composite column into one FieldStore, resulting in a structure that is hard to understand at all, let alone display comprehensibly. So in that case I punted and just made it print the source expression(s). Backpatch to 8.4 --- the lack of functionality exists in older releases, but doesn't seem to be important for lack of anything that would call it.
2010-01-23Insert CHECK_FOR_INTERRUPTS calls into loops in dbsize.c, to ensure thatTom Lane
the various disk-size-reporting functions will respond to query cancel reasonably promptly even in very large databases. Per report from Kevin Grittner.
2010-01-07Make bit/varbit substring() treat any negative length as meaning "all the restTom Lane
of the string". The previous coding treated only -1 that way, and would produce an invalid result value for other negative values. We ought to fix it so that 2-parameter bit substring() is a different C function and the 3-parameter form throws error for negative length, but that takes a pg_proc change which is impractical in the back branches; and in any case somebody might be relying on -1 working this way. So just do this as a back-patchable fix.
2009-12-12Fix integer-to-bit-string conversions to handle the first fractional byteTom Lane
correctly when the output bit width is wider than the given integer by something other than a multiple of 8 bits. This has been wrong since I first wrote that code for 8.0 :-(. Kudos to Roman Kononov for being the first to notice, though I didn't use his patch. Per bug #5237.
2009-12-09Prevent indirect security attacks via changing session-local state withinTom Lane
an allegedly immutable index function. It was previously recognized that we had to prevent such a function from executing SET/RESET ROLE/SESSION AUTHORIZATION, or it could trivially obtain the privileges of the session user. However, since there is in general no privilege checking for changes of session-local state, it is also possible for such a function to change settings in a way that might subvert later operations in the same session. Examples include changing search_path to cause an unexpected function to be called, or replacing an existing prepared statement with another one that will execute a function of the attacker's choosing. The present patch secures VACUUM, ANALYZE, and CREATE INDEX/REINDEX against these threats, which are the same places previously deemed to need protection against the SET ROLE issue. GUC changes are still allowed, since there are many useful cases for that, but we prevent security problems by forcing a rollback of any GUC change after completing the operation. Other cases are handled by throwing an error if any change is attempted; these include temp table creation, closing a cursor, and creating or deleting a prepared statement. (In 7.4, the infrastructure to roll back GUC changes doesn't exist, so we settle for rejecting changes of "search_path" in these contexts.) Original report and patch by Gurjeet Singh, additional analysis by Tom Lane. Security: CVE-2009-4136
2009-11-20Fix display and dumping of UPDATE OR TRUNCATE triggers (a bizarre combinationTom Lane
maybe, but we should get it right). Bug noted while reviewing TRIGGER WHEN patch. Already fixed in HEAD.
2009-11-05Allow binary-coercible cases in ri_HashCompareOp; there are some such casesTom Lane
that are not handled by find_coercion_pathway, notably composite->RECORD. Now that 8.4 supports composites as primary keys, it's worth dealing with this case.
2009-10-13Fix ts_stat's failure on empty tsvector.Tom Lane
Also insert a couple of Asserts that check for stack overflow. Bogus coding appears to be new in 8.4 --- older releases had a much simpler algorithm here. Per bug #5111.
2009-10-08Fix off-by-one bug in bitncmp(): When comparing a number of bits divisible byHeikki Linnakangas
8, bitncmp() may dereference a pointer one byte out of bounds. Chris Mikkelson (bug #5101)
2009-09-04Fix encoding handling in xml binary input function. If the XML header didn'tHeikki Linnakangas
specify an encoding explicitly, we used to treat it as being in database encoding when we parsed it, but then perform a UTF-8 -> database encoding conversion on it, which was completely bogus. It's now consistently treated as UTF-8.
2009-09-03Install a workaround for a longstanding gcc bug that allows SIGFPE trapsTom Lane
to occur for division by zero, even though the code is carefully avoiding that. All available evidence is that the only functions affected are int24div, int48div, and int28div, so patch just those three functions to include a "return" after the ereport() call. Backpatch to 8.4 so that the fix can be tested in production builds. For older branches our recommendation will continue to be to use -O1 on affected platforms (which are mostly non-mainstream anyway).
2009-08-30Remove duplicate variable initializations identified by clang static checker.Tom Lane
One of these represents a nontrivial bug (a promptly-leaked palloc), so backpatch. Greg Stark
2009-08-18Fix overflow for INTERVAL 'x ms' where x is more than a couple million,Tom Lane
and integer datetimes are in use. Per bug report from Hubert Depesz Lubaczewski. Alex Hunsaker
2009-07-29Fix time_part and timetz_part (ie, EXTRACT() for those datatypes) toTom Lane
include a fractional part in the output for MILLISECOND and SECOND cases, rather than truncating the source value. This is what the float-timestamp code has always done, and it was clearly the code author's intent to do the same for integer timestamps, but he forgot about integer division in C. The other datatypes supported by EXTRACT() already do this correctly. Backpatch to 8.4, so that the default (integer) behavior of that branch will match the default (float) behavior of older branches. Arguably we should patch further back, but it's possible that applications are expecting the broken behavior in older branches. 8.4 is new enough that expectations shouldn't be too settled. Per report from Greg Stark.
2009-07-28Fix incorrect cleanup of tsquery in ts_rewrite(). Per bug #4933 byTeodor Sigaev
Aaron Marcuse-Kubitza <aaronmk@blackducksoftware.com>
2009-07-23In a non-hashed Agg node, reset the "aggcontext" at group boundaries, insteadTom Lane
of individually pfree'ing pass-by-reference transition values. This should be at least as fast as the prior coding, and it has the major advantage of clearing out any working data an aggregate function may have stored in or underneath the aggcontext. This avoids memory leakage when an aggregate such as array_agg() is used in GROUP BY mode. Per report from Chris Spotts. Back-patch to 8.4. In principle the problem could arise in prior versions, but since they didn't have array_agg the issue seems not critical.
2009-07-06Fix ancient bug in handling of to_char modifier 'TH', when used with HH.Heikki Linnakangas
In what seems like an oversight, we used to treat 'TH' the same as lowercase 'th', but only with HH/HH12.
2009-06-23Fix an ancient error in dist_ps (distance from point to line segment), whichTom Lane
a number of other geometric operators also depend on. It miscalculated the slope of the perpendicular to the given line segment anytime that slope was other than 0, infinite, or +/-1. In some cases the error would be masked because the true closest point on the line segment was one of its endpoints rather than the intersection point, but in other cases it could give an arbitrarily bad answer. Per bug #4872 from Nick Roosevelt. Bug goes clear back to Berkeley days, so patch all supported branches. Make a couple of cosmetic adjustments while at it.
2009-06-22Make to_timestamp and friends skip leading spaces before an integer field,Tom Lane
even when not in FM mode. This improves compatibility with Oracle and with our pre-8.4 behavior, as per bug #4862. Brendan Jurd Add a couple of regression test cases for this. In passing, get rid of the labeling of the individual test cases; doesn't seem to be good for anything except causing extra work when inserting a test... Tom Lane
2009-06-22Revert dubious message wording change.Tom Lane
2009-06-21Message fixesPeter Eisentraut
2009-06-20Fix things so that array_agg_finalfn does not modify or free its inputTom Lane
ArrayBuildState, per trouble report from Merlin Moncure. By adopting this fix, we are essentially deciding that aggregate final-functions should not modify their inputs ever. Adjust documentation and comments to match that conclusion.
2009-06-118.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef listBruce Momjian
provided by Andrew.