summaryrefslogtreecommitdiff
path: root/src/backend/utils/adt
AgeCommit message (Collapse)Author
2008-10-02Fix improper display of fractional seconds in interval valuesTom Lane
when using --enable-integer-datetimes and a non-ISO datestyle. Ron Mayer
2008-07-07Fix estimate_num_groups() to assume that GROUP BY expressions yielding booleanTom Lane
results always contribute two groups, regardless of the expression contents. This is very substantially more accurate than the regular heuristic for certain boolean tests like "col IS NULL". Per gripe from Sam Mason. Back-patch to all supported releases, since the behavior of estimate_num_groups() hasn't changed all that much since 7.4.
2008-07-07Fix AT TIME ZONE (in all three variants) so that we first try to interpretTom Lane
the timezone argument as a timezone abbreviation, and only try it as a full timezone name if that fails. The zic database has four zones (CET, EET, MET, WET) that are full daylight-savings zones and yet have names that are the same as their abbreviations for standard time, resulting in ambiguity. In the timestamp input functions we resolve the ambiguity by preferring the abbreviation, and AT TIME ZONE should work the same way. (No functionality is lost because the zic database also has other names for these zones, eg Europe/Zurich.) Per gripe from Jaromir Talir. Backpatch to 8.1. Older releases did not have the issue because AT TIME ZONE only accepted abbreviations not zone names. (Thus, this patch also arguably fixes a compatibility botch introduced at 8.1: in ambiguous cases we now behave the same as 8.0 did.)
2008-06-09Fix datetime input functions to correctly detect integer overflow whenTom Lane
running on a 64-bit platform ... strtol() will happily return 64-bit output in that case. Per bug #4231 from Geoff Tolley.
2008-06-06Fix pg_get_ruledef() so that negative numeric constants are parenthesized.Tom Lane
This is needed because :: casting binds more tightly than minus, so for example -1::integer is not the same as (-1)::integer, and there are cases where the difference is important. In particular this caused a failure in SELECT DISTINCT ... ORDER BY ... where expressions that should have matched were seen as different by the parser; but I suspect that there could be other cases where failure to parenthesize leads to subtler semantic differences in reloaded rules. Per report from Alexandr Popov.
2008-05-28Backpatch Zdenek Kotala's fix to prevent pglz_decompress from stomping onTom Lane
memory if the compressed data is corrupt. Backpatch as far as 8.2. The issue exists in older branches too, but given the lack of field reports, it's not clear it's worth any additional effort to adapt the patch to the slightly different code in older branches.
2008-05-03The 8.2 patch that added support for an alias on the target table ofTom Lane
UPDATE/DELETE forgot to teach ruleutils.c to display the alias. Per bug #4141 from Mathias Seiler.
2008-04-11Fix several datatype input functions that were allowing unused bytes in theirTom Lane
results to contain uninitialized, unpredictable values. While this was okay as far as the datatypes themselves were concerned, it's a problem for the parser because occurrences of the "same" literal might not be recognized as equal by datumIsEqual (and hence not by equal()). It seems sufficient to fix this in the input functions since the only critical use of equal() is in the parser's comparisons of ORDER BY and DISTINCT expressions. Per a trouble report from Marc Cousin. Patch all the way back. Interestingly, array_in did not have the bug before 8.2, which may explain why the issue went unnoticed for so long.
2008-03-31Fix a number of places that were making file-type tests infelicitously.Tom Lane
The places that did, eg, (statbuf.st_mode & S_IFMT) == S_IFDIR were correct, but there is no good reason not to use S_ISDIR() instead, especially when that's what the other 90% of our code does. The places that did, eg, (statbuf.st_mode & S_IFDIR) were flat out *wrong* and would fail in various platform-specific ways, eg a symlink could be mistaken for a regular file on most Unixen. The actual impact of this is probably small, since the problem cases seem to always involve symlinks or sockets, which are unlikely to be found in the directories that PG code might be scanning. But it's clearly trouble waiting to happen, so patch all the way back anyway. (There seem to be no occurrences of the mistake in 7.4.)
2008-03-19Fix regexp substring matching (substring(string from pattern)) for the cornerTom Lane
case where there is a match to the pattern overall but the user has specified a parenthesized subexpression and that subexpression hasn't got a match. An example is substring('foo' from 'foo(bar)?'). This should return NULL, since (bar) isn't matched, but it was mistakenly returning the whole-pattern match instead (ie, 'foo'). Per bug #4044 from Rui Martins. This has been broken since the beginning; patch in all supported versions. The old behavior was sufficiently inconsistent that it's impossible to believe anyone is depending on it.
2008-03-13Fix varstr_cmp's special case for UTF8 encoding on Windows so that stringsTom Lane
that are reported as "equal" by wcscoll() are checked to see if they really are bitwise equal, and are sorted per strcmp() if not. We made this happen a couple of years ago in the regular code path, but it unaccountably got left out of the Windows/UTF8 case (probably brain fade on my part at the time). As in the prior set of changes, affected users may need to reindex indexes on textual columns. Backpatch as far as 8.2, which is the oldest release we are still supporting on Windows.
2008-02-25Fix datetime input to behave correctly for Feb 29 in years BC.Tom Lane
Formerly, DecodeDate attempted to verify the day-of-the-month exactly, but it was under the misapprehension that it would know whether we were looking at a BC year or not. In reality this check can't be made until the calling function (eg DecodeDateTime) has processed all the fields. So, split the BC adjustment and validity checks out into a new function ValidateDate that is called only after processing all the fields. In passing, this patch makes DecodeTimeOnly work for BC inputs, which it never did before. (The historical veracity of all this is nonexistent, of course, but if we're going to say we support proleptic Gregorian calendar then we should do it correctly. In any case the unpatched code is broken because it could emit dates that it would then reject on re-inputting.) Per report from Bernd Helmle. Back-patch as far as 8.0; in 7.x we were not using our own calendar support and so this seems a bit too risky to put into 7.4.
2008-01-06A long time ago, Peter pointed out that ruleutils.c didn't dump simpleTom Lane
constant ORDER/GROUP BY entries properly: http://archives.postgresql.org/pgsql-hackers/2001-04/msg00457.php The original solution to that was in fact no good, as demonstrated by today's report from Martin Pitt: http://archives.postgresql.org/pgsql-bugs/2008-01/msg00027.php We can't use the column-number-reference format for a constant that is a resjunk targetlist entry, a case that was unfortunately not thought of in the original discussion. What we can do instead (which did not work at the time, but does work in 7.3 and up) is to emit the constant with explicit ::typename decoration, even if it otherwise wouldn't need it. This is sufficient to keep the parser from thinking it's a column number reference, and indeed is probably what the user must have done to get such a thing into the querytree in the first place.
2008-01-03Make standard maintenance operations (including VACUUM, ANALYZE, REINDEX,Tom Lane
and CLUSTER) execute as the table owner rather than the calling user, using the same privilege-switching mechanism already used for SECURITY DEFINER functions. The purpose of this change is to ensure that user-defined functions used in index definitions cannot acquire the privileges of a superuser account that is performing routine maintenance. While a function used in an index is supposed to be IMMUTABLE and thus not able to do anything very interesting, there are several easy ways around that restriction; and even if we could plug them all, there would remain a risk of reading sensitive information and broadcasting it through a covert channel such as CPU usage. To prevent bypassing this security measure, execution of SET SESSION AUTHORIZATION and SET ROLE is now forbidden within a SECURITY DEFINER context. Thanks to Itagaki Takahiro for reporting this vulnerability. Security: CVE-2007-6600
2007-12-18Make path_recv() and poly_recv() reject paths/polygons containing no points.Tom Lane
The zero-point case is sensible so far as the data structure is concerned, so maybe we ought to allow it sometime; but right now the textual input routines for these types don't allow it, and it seems that not all the functions for the types are prepared to cope. Report and patch by Merlin Moncure.
2007-11-09Second pass at improving LIKE/regex estimation in non-C locales. It turnsTom Lane
out that it's actually quite likely that a string that is an extension of the given prefix will sort as larger than the "greater" string our previous code created. To provide some defense against that, do the comparisons against a modified string instead of just the bare prefix. We tack on "Z", "z", "y", or "9", whichever is seen as largest in the current locale. Testing suggests that this is sufficient at least for cases involving ASCII data.
2007-11-07Improve the performance of LIKE/regex estimation in non-C locales, by makingTom Lane
make_greater_string() try harder to generate a string that's actually greater than its input string. Before we just assumed that making a string that was memcmp-greater was enough, but it is easy to generate examples where this is not so when the locale is not C. Instead, loop until the relevant comparison function agrees that the generated string is greater than the input. Unfortunately this is probably not enough to guarantee that the generated string is greater than all extensions of the input, so we cannot relax the restriction to C locale for the LIKE/regex index optimization. But it should at least improve the odds of getting a useful selectivity estimate in prefix_selectivity(). Per example from Guillaume Smet. Backpatch to 8.1, mainly because that's what the complainant is using...
2007-11-07Fix patternsel() and callers to do the right thing for NOT LIKE and the otherTom Lane
negated-match operators. patternsel had been using the supplied operator as though it were a positive-match operator, and thus obtaining a wrong result, which was even more wrong after the caller subtracted it from 1. Seems cleanest to give patternsel an explicit "negate" argument so that it knows what's going on. Also install the same factorization scheme for pattern join selectivity estimators; even though they are just stubs at the moment, this may keep someone from making the same type of mistake when they get filled out. Per report from Greg Mullane. Backpatch to 8.2 --- previous releases do not show the problem because patternsel() doesn't actually use the operator directly.
2007-10-13Fix ALTER COLUMN TYPE to preserve the tablespace and reloptions of indexesTom Lane
it affects. The original coding neglected tablespace entirely (causing the indexes to move to the database's default tablespace) and for an index belonging to a UNIQUE or PRIMARY KEY constraint, it would actually try to assign the parent table's reloptions to the index :-(. Per bug #3672 and subsequent investigation. 8.0 and 8.1 did not have reloptions, but the tablespace bug is present.
2007-09-22Fix bogus calculation of potential output string length in translate().Tom Lane
2007-09-19Prevent corr() from returning the wrong results for negative correlationNeil Conway
values. The previous coding essentially assumed that x = sqrt(x*x), which does not hold for x < 0. Thanks to Jie Zhang at Greenplum and Gavin Sherry for reporting this issue.
2007-09-16Fix overflow in extract(epoch from interval) for intervals exceeding 68 years.Tom Lane
Seems to have been introduced in 8.1 by careless SECS_PER_DAY search-and-replace.
2007-08-31Apply a band-aid fix for the problem that 8.2 and up completely misestimateTom Lane
the number of rows likely to be produced by a query such as SELECT * FROM t1 LEFT JOIN t2 USING (key) WHERE t2.key IS NULL; What this is doing is selecting for t1 rows with no match in t2, and thus it may produce a significant number of rows even if the t2.key table column contains no nulls at all. 8.2 thinks the table column's null fraction is relevant and thus may estimate no rows out, which results in terrible plans if there are more joins above this one. A proper fix for this will involve passing much more information about the context of a clause to the selectivity estimator functions than we ever have. There's no time left to write such a patch for 8.3, and it wouldn't be back-patchable into 8.2 anyway. Instead, put in an ad-hoc test to defeat the normal table-stats-based estimation when an IS NULL test is evaluated at an outer join, and just use a constant estimate instead --- I went with 0.5 for lack of a better idea. This won't catch every case but it will catch the typical ways of writing such queries, and it seems unlikely to make things worse for other queries.
2007-08-21Fix potential access-off-the-end-of-memory in varbit_out(): it fetched theTom Lane
byte after the last full byte of the bit array, regardless of whether that byte was part of the valid data or not. Found by buildfarm testing. Thanks to Stefan Kaltenbrunner for nailing down the cause.
2007-08-15Repair problems occurring when multiple RI updates have to be done to the sameTom Lane
row within one query: we were firing check triggers before all the updates were done, leading to bogus failures. Fix by making the triggers queued by an RI update go at the end of the outer query's trigger event list, thereby effectively making the processing "breadth-first". This was indeed how it worked pre-8.0, so the bug does not occur in the 7.x branches. Per report from Pavel Stehule.
2007-07-19Make replace(), split_part(), and string_to_array() behave somewhat sanelyTom Lane
when handed an invalidly-encoded pattern. The previous coding could get into an infinite loop if pg_mb2wchar_with_len() returned a zero-length string after we'd tested for nonempty pattern; which is exactly what it will do if the string consists only of an incomplete multibyte character. This led to either an out-of-memory error or a backend crash depending on platform. Per report from Wiktor Wodecki.
2007-07-09Fix stddev_pop(numeric) and var_pop(numeric), which were incorrectly producingTom Lane
the same outputs as stddev_samp() and var_samp() respectively.
2007-06-29Fix a passel of ancient bugs in to_char(), including two distinct bufferTom Lane
overruns (neither of which seem likely to be exploitable as security holes, fortunately, since the provoker can't control the data written). One of these is due to choosing to stomp on the output of a called function, which is bad news in any case; make it treat the called functions' results as read-only. Avoid some unnecessary palloc/pfree traffic too; it's not really helpful to free small temporary objects, and again this is presuming more than it ought to about the nature of the results of called functions. Per report from Patrick Welche and additional code-reading by Imad.
2007-06-12Fix DecodeDateTime to allow timezone to appear before year. This hadTom Lane
historically worked in some but not all cases, but as of 8.2 it failed for all timezone formats. Fix, and add regression test cases to catch future regressions in this area. Per gripe from Adam Witney.
2007-06-09Allow numeric_fac() to be interrupted, since it can take quite a while forTom Lane
large inputs. Also cause it to error out immediately if the result will overflow, instead of grinding through a lot of calculation first. Per gripe from Jim Nasby.
2007-06-02Fix erroneous error reporting for overlength input in text_date(),Tom Lane
text_time(), and text_timetz(). 7.4-vintage bug found by Greg Stark.
2007-05-29Fix a bug in input processing for the "interval" type. Previously,Neil Conway
"microsecond" and "millisecond" units were not considered valid input by themselves, which caused inputs like "1 millisecond" to be rejected erroneously. Update the docs, add regression tests, and backport to 8.2 and 8.1
2007-05-17Temporary fix for the problem that pg_stat_activity, inet_client_addr(),Tom Lane
and inet_server_addr() fail if the client connected over a "scoped" IPv6 address. In this case getnameinfo() will return a string ending with a poorly-standardized "%something" zone specifier, which these functions try to feed to network_in(), which won't take it. So that we don't lose functionality altogether, suppress the zone specifier before giving the string to network_in(). Per report from Brian Hirt. TODO: probably someday the inet type should support scoped IPv6 addresses, and then this patch should be reverted. Backpatch to 8.2 ... is it worth going further?
2007-05-05Check return code from strxfrm on Windows since it has aMagnus Hagander
non-standard way of indicating errors, so we don't try to allocate INT_MAX bytes to store a result in.
2007-03-11Fix a race condition that caused pg_database_size() and pg_tablespace_size()Alvaro Herrera
to fail if an object was removed between calls to ReadDir() and stat(). Per discussion in pgsql-hackers. http://archives.postgresql.org/pgsql-hackers/2007-03/msg00671.php Bug report and patch by Michael Fuhr.
2007-02-08Fix bug when localized to_char() day or month names were incorectlyBruce Momjian
trnasformed to lower or upper string. Backpatch to 8.2.X. Pavel Stehule
2007-01-30Clarify paramater handling for pg_get_serial_sequence().Bruce Momjian
2007-01-28Dept of second thoughts: the IQ of estimate_array_length() needs to beTom Lane
kept on par with that of scalararraysel(), else estimates that should track might not. Hence teach it about binary-compatible cases, too.
2007-01-28Fix scalararraysel() to cope with binary-compatible cases, such as text[]Tom Lane
versus varchar[]. This oversight probably explains Ryan Holmes' recent complaint --- he was getting a generic selectivity estimate instead of anything intelligent.
2007-01-25Properly detoast access to bytea field pg_trigger.tgargs. Old codeBruce Momjian
might cause server crash. Backpatch to 8.2.X.
2007-01-12Fix handling of CC (century) format spec in to_date/to_char. According toTom Lane
standard convention the 21st century runs from 2001-2100, not 2000-2099, so make it work like that. Per bug #2885 from Akio Iwaasa. Backpatch to 8.2, but no further, since this is really a definitional change; users of older branches are probably more interested in stability.
2007-01-03Fix regex_fixed_prefix() to cope reasonably well with regex patterns of theTom Lane
form '^(foo)$'. Before, these could never be optimized into indexscans. The recent changes to make psql and pg_dump generate such patterns (for \d commands and -t and related switches, respectively) therefore represented a big performance hit for people with large pg_class catalogs, as seen in recent gripe from Erik Jones. While at it, be more paranoid about case-sensitivity checking in multibyte encodings, and fix some other corner cases in which a regex might be interpreted too liberally.
2006-12-15Fix some planner bugs exposed by reports from Arjen van der Meijden. TheseTom Lane
are all in new-in-8.2 logic associated with indexability of ScalarArrayOpExpr (IN-clauses) or amortization of indexscan costs across repeated indexscans on the inside of a nestloop. In particular: Fix some logic errors in the estimation for multiple scans induced by a ScalarArrayOpExpr indexqual. Include a small cost component in bitmap index scans to reflect the costs of manipulating the bitmap itself; this is mainly to prevent a bitmap scan from appearing to have the same cost as a plain indexscan for fetching a single tuple. Also add a per-index-scan-startup CPU cost component; while prior releases were clearly too pessimistic about the cost of repeated indexscans, the original 8.2 coding allowed the cost of an indexscan to effectively go to zero if repeated often enough, which is overly optimistic. Pay some attention to index correlation when estimating costs for a nestloop inner indexscan: this is significant when the plan fetches multiple heap tuples per iteration, since high correlation means those tuples are probably on the same or adjacent heap pages.
2006-11-28Add workaround for localizing May and abbreviated May differently. IdeaPeter Eisentraut
of Dennis Björklund.
2006-11-24Revert (too late in beta):Bruce Momjian
Fix to_char() locale handling to honor LC_TIME, not LC_MESSAGES. Euler Taveira de Oliveira
2006-11-24Change pg_stat_all_tables and sister views to put the recently-addedTom Lane
vacuum/analyze timestamp columns at the end, rather than at a random spot in the middle as in the original patch. This was deemed more usable as well as less likely to break existing application code. initdb forced accordingly. In passing, remove former kluge for initializing pg_stat_file()'s pg_proc entry --- bootstrap mode was fixed recently so that this can be done without any hacks, but I overlooked this usage.
2006-11-24Fix to_char() locale handling to honor LC_TIME, not LC_MESSAGES.Bruce Momjian
Euler Taveira de Oliveira
2006-11-21On systems that have setsid(2) (which should be just about everything exceptTom Lane
Windows), arrange for each postmaster child process to be its own process group leader, and deliver signals SIGINT, SIGTERM, SIGQUIT to the whole process group not only the direct child process. This provides saner behavior for archive and recovery scripts; in particular, it's possible to shut down a warm-standby recovery server using "pg_ctl stop -m immediate", since delivery of SIGQUIT to the startup subprocess will result in killing the waiting recovery_command. Also, this makes Query Cancel and statement_timeout apply to scripts being run from backends via system(). (There is no support in the core backend for that, but it's widely done using untrusted PLs.) Per gripe from Stephen Harris and subsequent discussion.
2006-11-11Suppress a few 'uninitialized variable' warnings that gcc emits only atTom Lane
-O3 or higher (presumably because it inlines more things). Per gripe from Mark Mielke.
2006-11-10Fix pg_get_serial_sequence(), which could incorrectly return the nameTom Lane
of an index on a serial column, rather than the name of the associated sequence. Fallout from recent changes in dependency setup for serials. Per bug #2732 from Basil Evseenko.