summaryrefslogtreecommitdiff
path: root/src/backend/utils/adt
AgeCommit message (Collapse)Author
2008-04-04Re-implement division for numeric values using the traditional "schoolbook"Tom Lane
algorithm. This is a good deal slower than our old roundoff-error-prone code for long inputs, so we keep the old code for use in the transcendental functions, where everything is approximate anyway. Also create a user-accessible function div(numeric, numeric) to provide access to the exact result of trunc(x/y) --- since the regular numeric / operator will round off its result, simply computing that expression in SQL doesn't reliably give the desired answer. This fixes bug #3387 and various related corner cases, and improves the usefulness of PG for high-precision integer arithmetic.
2008-04-04Implement current_query(), that shows the currently executing query.Bruce Momjian
At the same time remove dblink/dblink_current_query() as it is no longer necessary *BACKWARD COMPATIBILITY ISSUE* for dblink Tomas Doran
2008-04-04Turn xmlbinary and xmloption GUC variables into enumsTurn xmlbinary andMagnus Hagander
xmloption GUC variables into enums..
2008-04-02Convert three more guc settings to enum type:Magnus Hagander
default_transaction_isolation, session_replication_role and regex_flavor.
2008-03-31Fix a number of places that were making file-type tests infelicitously.Tom Lane
The places that did, eg, (statbuf.st_mode & S_IFMT) == S_IFDIR were correct, but there is no good reason not to use S_ISDIR() instead, especially when that's what the other 90% of our code does. The places that did, eg, (statbuf.st_mode & S_IFDIR) were flat out *wrong* and would fail in various platform-specific ways, eg a symlink could be mistaken for a regular file on most Unixen. The actual impact of this is probably small, since the problem cases seem to always involve symlinks or sockets, which are unlikely to be found in the directories that PG code might be scanning. But it's clearly trouble waiting to happen, so patch all the way back anyway. (There seem to be no occurrences of the mistake in 7.4.)
2008-03-28Support statement-level ON TRUNCATE triggers. Simon RiggsTom Lane
2008-03-26Move the HTSU_Result enum definition into snapshot.h, to avoid includingAlvaro Herrera
tqual.h into heapam.h. This makes all inclusion of tqual.h explicit. I also sorted alphabetically the includes on some source files.
2008-03-26Rename snapmgmt.c/h to snapmgr.c/h, for consistency with other files.Alvaro Herrera
Per complaint from Tom Lane.
2008-03-26Separate snapshot management code from tuple visibility code, create aAlvaro Herrera
snapmgmt.c file for the former. The header files have also been reorganized in three parts: the most basic snapshot definitions are now in a new file snapshot.h, and the also new snapmgmt.h keeps the definitions for snapmgmt.c. tqual.h has been reduced to the bare minimum. This patch is just a first step towards managing live snapshots within a transaction; there is no functionality change. Per my proposal to pgsql-patches on 20080318191940.GB27458@alvh.no-ip.org and subsequent discussion.
2008-03-25Simplify and standardize conversions between TEXT datums and ordinary CTom Lane
strings. This patch introduces four support functions cstring_to_text, cstring_to_text_with_len, text_to_cstring, and text_to_cstring_buffer, and two macros CStringGetTextDatum and TextDatumGetCString. A number of existing macros that provided variants on these themes were removed. Most of the places that need to make such conversions now require just one function or macro call, in place of the multiple notational layers that used to be needed. There are no longer any direct calls of textout or textin, and we got most of the places that were using handmade conversions via memcpy (there may be a few still lurking, though). This commit doesn't make any serious effort to eliminate transient memory leaks caused by detoasting toasted text objects before they reach text_to_cstring. We changed PG_GETARG_TEXT_P to PG_GETARG_TEXT_PP in a few places where it was easy, but much more could be done. Brendan Jurd and Tom Lane
2008-03-24Fix various infelicities that have snuck into usage of errdetail() andTom Lane
friends. Avoid double translation of some messages, ensure other messages are exposed for translation (and make them follow the style guidelines), avoid unsafe passing of an unpredictable message text as a format string.
2008-03-23Create a function quote_nullable(), which works the same as quote_literal()Tom Lane
except that it returns the string 'NULL', rather than a SQL null, when called with a null argument. This is often a much more useful behavior for constructing dynamic queries. Add more discussion to the documentation about how to use these functions. Brendan Jurd
2008-03-22Refactor to_char/to_date formatting code; primarily, replace DCH_processorTom Lane
with two new functions DCH_to_char and DCH_from_char that have less confusing APIs. Brendan Jurd
2008-03-21Get rid of a bunch of #ifdef HAVE_INT64_TIMESTAMP conditionals by inventingTom Lane
a new typedef TimeOffset to represent an intermediate time value. It's either int64 or double as appropriate, and in most usages will be measured in microseconds or seconds the same as Timestamp. We don't call it Timestamp, though, since the value doesn't necessarily represent an absolute time instant. Warren Turkal
2008-03-19Fix regexp substring matching (substring(string from pattern)) for the cornerTom Lane
case where there is a match to the pattern overall but the user has specified a parenthesized subexpression and that subexpression hasn't got a match. An example is substring('foo' from 'foo(bar)?'). This should return NULL, since (bar) isn't matched, but it was mistakenly returning the whole-pattern match instead (ie, 'foo'). Per bug #4044 from Rui Martins. This has been broken since the beginning; patch in all supported versions. The old behavior was sufficiently inconsistent that it's impossible to believe anyone is depending on it.
2008-03-17Revert thinko introduced into prefix_selectivity() by my recent patch:Tom Lane
make_greater_string needs the < procedure not the >= one. Spotted by Peter.
2008-03-13Fix varstr_cmp's special case for UTF8 encoding on Windows so that stringsTom Lane
that are reported as "equal" by wcscoll() are checked to see if they really are bitwise equal, and are sorted per strcmp() if not. We made this happen a couple of years ago in the regular code path, but it unaccountably got left out of the Windows/UTF8 case (probably brain fade on my part at the time). As in the prior set of changes, affected users may need to reindex indexes on textual columns. Backpatch as far as 8.2, which is the oldest release we are still supporting on Windows.
2008-03-10Fix unportable coding of new error message, per Kris Jurka.Tom Lane
2008-03-10Document and enforce that the usable range of setseed() arguments isTom Lane
-1 to 1, not 0 to 1. The actual behavior for values within this range does not change. Kris Jurka
2008-03-09Change patternsel() so that instead of switching from a pureTom Lane
pattern-examination heuristic method to purely histogram-driven selectivity at histogram size 100, we compute both estimates and use a weighted average. The weight put on the heuristic estimate decreases linearly with histogram size, dropping to zero for 100 or more histogram entries. Likewise in ltreeparentsel(). After a patch by Greg Stark, though I reorganized the logic a bit to give the caller of histogram_selectivity() more control.
2008-03-08Modify prefix_selectivity() so that it will never estimate the selectivityTom Lane
of the generated range condition var >= 'foo' AND var < 'fop' as being less than what eqsel() would estimate for var = 'foo'. This is intuitively reasonable and it gets rid of the need for some entirely ad-hoc coding we formerly used to reject bogus estimates. The basic problem here is that if the prefix is more than a few characters long, the two boundary values are too close together to be distinguishable by comparison to the column histogram, resulting in a selectivity estimate of zero, which is often not very sane. Change motivated by an example from Peter Eisentraut. Arguably this is a bug fix, but I'll refrain from back-patching it for the moment.
2008-03-08Improve pglz_decompress() so that it cannot clobber memory beyond theTom Lane
available output buffer when presented with corrupt input. Some testing suggests that this slows the decompression loop about 1%, which seems an acceptable price to pay for more robustness. (Curiously, the penalty seems to be *less* on not-very-compressible data, which I didn't expect since the overhead per output byte ought to be more in the literal-bytes path.) Patch from Zdenek Kotala. I fixed a corner case and did some renaming of variables to make the routine more readable.
2008-03-07This patch addresses some issues in TOAST compression strategy thatTom Lane
were discussed last year, but we felt it was too late in the 8.3 cycle to change the code immediately. Specifically, the patch: * Reduces the minimum datum size to be considered for compression from 256 to 32 bytes, as suggested by Greg Stark. * Increases the required compression rate for compressed storage from 20% to 25%, again per Greg's suggestion. * Replaces force_input_size (size above which compression is forced) with a maximum size to be considered for compression. It was agreed that allowing large inputs to escape the minimum-compression-rate requirement was not bright, and that indeed we'd rather have a knob that acted in the other direction. I set this value to 1MB for the moment, but it could use some performance studies to tune it. * Adds an early-failure path to the compressor as suggested by Jan: if it's been unable to find even one compressible substring in the first 1KB (parameterizable), assume we're looking at incompressible input and give up. (Possibly this logic can be improved, but I'll commit it as-is for now.) * Improves the toasting heuristics so that when we have very large fields with attstorage 'x' or 'e', we will push those out to toast storage before considering inline compression of shorter fields. This also responds to a suggestion of Greg's, though my original proposal for a solution was a bit off base because it didn't fix the problem for large 'e' fields. There was some discussion in the earlier threads of exposing some of the compression knobs to users, perhaps even on a per-column basis. I have not done anything about that here. It seems to me that if we are changing around the parameters, we'd better get some experience and be sure we are happy with the design before we set things in stone by providing user-visible knobs.
2008-03-05When text search string is too long, in error message report actual andBruce Momjian
maximum number of bytes allowed.
2008-03-01Fix unportable usages of tolower(). On signed-char machines, it is necessaryTom Lane
to explicitly cast the output back to char before comparing it to a char value, else we get the wrong result for high-bit-set characters. Found by Rolf Jentsch. Also, fix several places where <ctype.h> functions were being called without casting the argument to unsigned char; this is likewise unportable, but we keep making that mistake :-(. These found by buildfarm member salamander, which I will desperately miss if it ever goes belly-up.
2008-03-01Disable the undocumented xmlvalidate() function, which was unintentionallyTom Lane
left in the code though it was not meant to be provided. It represents a security hole because unprivileged users could use it to look at (at least the first line of) any file readable by the backend. Fortunately, this is only possible if the backend was built with XML support, so the damage is at least mitigated; and 8.3 probably hasn't propagated into any security-critical uses yet anyway. Per report from Sergey Burladyan.
2008-02-29Remove long-unused and broken TCL_ARRAYS.Alvaro Herrera
2008-02-26Fix encode(...bytea..., 'escape') so that it converts all high-bit-set byteTom Lane
values into \nnn octal escape sequences. When the database encoding is multibyte this is *necessary* to avoid generating invalidly encoded text. Even in a single-byte encoding, the old behavior seems very hazardous --- consider for example what happens if the text is transferred to another database with a different encoding. Decoding would then yield some other bytea value than what was encoded, which is surely undesirable. Per gripe from Hernan Gonzalez. Backpatch to 8.3, but not further. This is a bit of a judgment call, but I make it on these grounds: pre-8.3 we don't really have much encoding safety anyway because of the convert() function family, and we would also have much higher risk of breaking existing apps that may not be expecting this behavior. 8.3 is still new enough that we can probably get away with making this change in the function's behavior.
2008-02-25Reject year zero during datetime input, except when it's a 2-digit yearTom Lane
(then it means 2000 AD). Formerly we silently interpreted this as 1 BC, which at best is unwarranted familiarity with the implementation. It's barely possible that some app somewhere expects the old behavior, though, so we won't back-patch this into existing release branches.
2008-02-25Fix datetime input to behave correctly for Feb 29 in years BC.Tom Lane
Formerly, DecodeDate attempted to verify the day-of-the-month exactly, but it was under the misapprehension that it would know whether we were looking at a BC year or not. In reality this check can't be made until the calling function (eg DecodeDateTime) has processed all the fields. So, split the BC adjustment and validity checks out into a new function ValidateDate that is called only after processing all the fields. In passing, this patch makes DecodeTimeOnly work for BC inputs, which it never did before. (The historical veracity of all this is nonexistent, of course, but if we're going to say we support proleptic Gregorian calendar then we should do it correctly. In any case the unpatched code is broken because it could emit dates that it would then reject on re-inputting.) Per report from Bernd Helmle. Back-patch as far as 8.0; in 7.x we were not using our own calendar support and so this seems a bit too risky to put into 7.4.
2008-02-19Refactor backend makefiles to remove lots of duplicate codePeter Eisentraut
2008-02-18Remove unnecessary opening of other relation in RI_FKey_keyequal_upd_pkTom Lane
and RI_FKey_keyequal_upd_fk, as well as no-longer-needed calls of ri_BuildQueryKeyFull. Aside from saving a few cycles, this avoids needless deadlock risks when an update is not changing the columns that participate in an RI constraint. Per a gripe from Alexey Nalbat. Back-patch to 8.3. Earlier releases did have a need to open the other relation due to the way in which they retrieved information about the RI constraint, so this problem unfortunately can't easily be improved pre-8.3. Tom Lane and Stephan Szabo
2008-02-17Replace time_t with pg_time_t (same values, but always int64) in on-diskTom Lane
data structures and backend internal APIs. This solves problems we've seen recently with inconsistent layout of pg_control between machines that have 32-bit time_t and those that have already migrated to 64-bit time_t. Also, we can get out from under the problem that Windows' Unix-API emulation is not consistent about the width of time_t. There are a few remaining places where local time_t variables are used to hold the current or recent result of time(NULL). I didn't bother changing these since they do not affect any cross-module APIs and surely all platforms will have 64-bit time_t before overflow becomes an actual risk. time_t should be avoided for anything visible to extension modules, however.
2008-02-07Avoid misbehavior in foreign key checks when casting to a datatype for whichTom Lane
the parser supplies a default typmod that can result in data loss (ie, truncation). Currently that appears to be only CHARACTER and BIT. We can avoid the problem by specifying the type's internal name instead of using SQL-spec syntax. Since the queries generated here are only used internally, there's no need to worry about portability. This problem is new in 8.3; before we just let the parser do whatever it wanted to resolve the operator, but 8.3 is trying to be sure that the semantics of FK checks are consistent. Per report from Harald Fuchs.
2008-01-25Release any detoasted copies of arrays that are made temporarily inTom Lane
ri_FetchConstraintInfo, to avoid a query-duration memory leak when that routine is called by RI_FKey_keyequal_upd_fk (which isn't executed in a short-lived context). This problem was latent when the routine was added in February, but it didn't become serious until the varvarlena patch made it quite likely that the fields being examined would be "toasted" (ie, have short headers). Per report from Stephen Denne.
2008-01-15Revise memory management for libxml calls. Instead of keeping libxml's dataTom Lane
in whichever context happens to be current during a call of an xml.c function, use a dedicated context that will not go away until we explicitly delete it (which we do at transaction end or subtransaction abort). This makes recovery after an error much simpler --- we don't have to individually delete the data structures created by libxml. Also, we need to initialize and cleanup libxml only once per transaction (if there's no error) instead of once per function call, so it should be a bit faster. We'll need to keep an eye out for intra-transaction memory leaks, though. Alvaro and Tom.
2008-01-12It turns out the LIBXML_TEST_VERSION macro calls xmlInitParser().Tom Lane
Therefore we must xmlCleanupParser(), or we risk leaving behind dangling pointers to whatever memory context is current when xml_init() is called. This seems to fix bug #3860, though we might still want the more invasive solution being worked on by Alvaro.
2008-01-12Fix two places in xml.c that neglected to check the return values ofNeil Conway
SPI_prepare() and SPI_cursor_open(), to silence a Coverity warning.
2008-01-12Minor perf tweak for _SPI_strdup(): if we're going to call strlen()Neil Conway
anyway, it is faster to memcpy() than to strcpy().
2008-01-08lmgr.c:DescribeLockTag was never taught about virtual xids, per Greg Stark.Tom Lane
Also a couple of minor tweaks to try to future-proof the code a bit better against future locktag additions.
2008-01-08Remove unnecessary comma in enum definition ... some C compilers don'tTom Lane
like that. Per report from J6M.
2008-01-06A long time ago, Peter pointed out that ruleutils.c didn't dump simpleTom Lane
constant ORDER/GROUP BY entries properly: http://archives.postgresql.org/pgsql-hackers/2001-04/msg00457.php The original solution to that was in fact no good, as demonstrated by today's report from Martin Pitt: http://archives.postgresql.org/pgsql-bugs/2008-01/msg00027.php We can't use the column-number-reference format for a constant that is a resjunk targetlist entry, a case that was unfortunately not thought of in the original discussion. What we can do instead (which did not work at the time, but does work in 7.3 and up) is to emit the constant with explicit ::typename decoration, even if it otherwise wouldn't need it. This is sufficient to keep the parser from thinking it's a column number reference, and indeed is probably what the user must have done to get such a thing into the querytree in the first place.
2008-01-03Make standard maintenance operations (including VACUUM, ANALYZE, REINDEX,Tom Lane
and CLUSTER) execute as the table owner rather than the calling user, using the same privilege-switching mechanism already used for SECURITY DEFINER functions. The purpose of this change is to ensure that user-defined functions used in index definitions cannot acquire the privileges of a superuser account that is performing routine maintenance. While a function used in an index is supposed to be IMMUTABLE and thus not able to do anything very interesting, there are several easy ways around that restriction; and even if we could plug them all, there would remain a risk of reading sensitive information and broadcasting it through a covert channel such as CPU usage. To prevent bypassing this security measure, execution of SET SESSION AUTHORIZATION and SET ROLE is now forbidden within a SECURITY DEFINER context. Thanks to Itagaki Takahiro for reporting this vulnerability. Security: CVE-2007-6600
2008-01-01Fix some missed copyright updates.Tom Lane
2008-01-01Update copyrights in source tree to 2008.Bruce Momjian
2007-12-27Wording improvementsPeter Eisentraut
2007-12-20When given a nonzero column number, pg_get_indexdef() is only supposed toTom Lane
print the index key variable or expression for that column. It was mistakenly printing ASC/DESC/NULLS FIRST/NULLS LAST decoration too --- and not only for the target column, but all columns. Someday we should have an option to extract that info (and the opclass decoration as well) for a single index column ... but today is not that day. Per bug #3829 and subsequent discussion.
2007-12-18Fix thinko in encoding check for chr()Andrew Dunstan
2007-12-18Make path_recv() and poly_recv() reject paths/polygons containing no points.Tom Lane
The zero-point case is sensible so far as the data structure is concerned, so maybe we ought to allow it sometime; but right now the textual input routines for these types don't allow it, and it seems that not all the functions for the types are prepared to cope. Report and patch by Merlin Moncure.
2007-12-08Fix mergejoin cost estimation so that we consider the statistical ranges ofTom Lane
the two join variables at both ends: not only trailing rows that need not be scanned because there cannot be a match on the other side, but initial rows that will be scanned without possibly having a match. This allows a more realistic estimate of startup cost to be made, per recent pgsql-performance discussion. In passing, fix a couple of bugs that had crept into mergejoinscansel: it was not quite up to speed for the task of estimating descending-order scans, which is a new requirement in 8.3.