path: root/src/backend/utils
Age | Commit message | Author
2013-05-10 | Guard against input_rows == 0 in estimate_num_groups(). | Tom Lane
This case doesn't normally happen, because the planner usually clamps all row estimates to at least one row; but I found that it can arise when dealing with relations excluded by constraints. Without a defense, estimate_num_groups() can return zero, which leads to divisions by zero inside the planner as well as assertion failures in the executor. An alternative fix would be to change set_dummy_rel_pathlist() to make the size estimate for a dummy relation 1 row instead of 0, but that seemed pretty ugly; and probably someday we'll want to drop the convention that the minimum rowcount estimate is 1 row. Back-patch to 8.4, as the problem can be demonstrated that far back.
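The defense described above comes down to clamping before anyone divides by the result. A minimal sketch of that idea, assuming a hypothetical helper named clamp_group_estimate (this is not the planner's actual code, and the parameter names are made up for illustration):

```c
/*
 * Hypothetical helper, not PostgreSQL's estimate_num_groups().
 * Returns a group-count estimate that is never below 1 and never
 * above the (clamped) number of input rows.
 */
double
clamp_group_estimate(double raw_groups, double input_rows)
{
    /* A dummy relation may report zero rows; treat that as one row. */
    if (input_rows < 1.0)
        input_rows = 1.0;

    /* There cannot be more groups than input rows. */
    if (raw_groups > input_rows)
        raw_groups = input_rows;

    /* Never return zero groups, since callers may divide by this value. */
    if (raw_groups < 1.0)
        raw_groups = 1.0;

    return raw_groups;
}
```

With this shape, a zero-row input yields an estimate of one group instead of zero, which is the property the commit message asks for.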
2013-04-20 | Fix longstanding race condition in plancache.c. | Tom Lane
When creating or manipulating a cached plan for a transaction control command (particularly ROLLBACK), we must not perform any catalog accesses, since we might be in an aborted transaction. However, plancache.c busily saved or examined the search_path for every cached plan. If we were unlucky enough to do this at a moment where the path's expansion into schema OIDs wasn't already cached, we'd do some catalog accesses; and with some more bad luck such as an ill-timed signal arrival, that could lead to crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya. Fortunately, there's no real need to consider the search path for such commands, so we can just skip the relevant steps when the subject statement is a TransactionStmt. This is somewhat related to bug #5269, though the failure happens during initial cached-plan creation rather than revalidation. This bug has been there since the plan cache was invented, so back-patch to all supported branches.
2013-04-01 | Fix insecure parsing of server command-line switches. | Tom Lane
An oversight in commit e710b65c1c56ca7b91f662c63d37ff2e72862a94 allowed database names beginning with "-" to be treated as though they were secure command-line switches; and this switch processing occurs before client authentication, so that even an unprivileged remote attacker could exploit the bug, needing only connectivity to the postmaster's port. Assorted exploits for this are possible, some requiring a valid database login, some not. The worst known problem is that the "-r" switch can be invoked to redirect the process's stderr output, so that subsequent error messages will be appended to any file the server can write. This can for example be used to corrupt the server's configuration files, so that it will fail when next restarted. Complete destruction of database tables is also possible. Fix by keeping the database name extracted from a startup packet fully separate from command-line switches, as had already been done with the user name field. The Postgres project thanks Mitsumasa Kondo for discovering this bug, Kyotaro Horiguchi for drafting the fix, and Noah Misch for recognizing the full extent of the danger. Security: CVE-2013-1899
2013-03-05 | Fix to_char() to use ASCII-only case-folding rules where appropriate. | Tom Lane
formatting.c used locale-dependent case folding rules in some code paths where the result isn't supposed to be locale-dependent, for example to_char(timestamp, 'DAY'). Since the source data is always just ASCII in these cases, that usually didn't matter ... but it does matter in Turkish locales, which have unusual treatment of "i" and "I". To confuse matters even more, the misbehavior was only visible in UTF8 encoding, because in single-byte encodings we used pg_toupper/pg_tolower which don't have locale-specific behavior for ASCII characters. Fix by providing intentionally ASCII-only case-folding functions and using these where appropriate. Per bug #7913 from Adnan Dursun. Back-patch to all active branches, since it's been like this for a long time.
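An intentionally ASCII-only case-folding function can be sketched as follows. The names ascii_toupper_char and ascii_upper_inplace are hypothetical and are not claimed to match what formatting.c uses; the point is only that nothing here consults the locale, so "i" always folds to "I":

```c
/* Hypothetical name: upper-case one character using ASCII rules only. */
char
ascii_toupper_char(char ch)
{
    if (ch >= 'a' && ch <= 'z')
        return (char) (ch - 'a' + 'A');
    return ch;                  /* non-ASCII-lowercase bytes pass through */
}

/* Hypothetical name: upper-case a NUL-terminated string in place. */
void
ascii_upper_inplace(char *s)
{
    for (; *s != '\0'; s++)
        *s = ascii_toupper_char(*s);
}
```

Because bytes outside 'a'..'z' are left untouched, multibyte sequences in a UTF8 string are not disturbed by this folding.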
2013-03-04 | Fix overflow check in tm2timestamp (this time for sure). | Tom Lane
I fixed this code back in commit 841b4a2d5, but didn't think carefully enough about the behavior near zero, which meant it improperly rejected 1999-12-31 24:00:00. Per report from Magnus Hagander.
2013-02-04 | Prevent execution of enum_recv() from SQL. | Tom Lane
This function was misdeclared to take cstring when it should take internal. This at least allows crashing the server, and in principle an attacker might be able to use the function to examine the contents of server memory. The correct fix is to adjust the system catalog contents (and fix the regression tests that should have caught this but failed to). However, asking users to correct the catalog contents in existing installations is a pain, so as a band-aid fix for the back branches, install a check in enum_recv() to make it throw error if called with a cstring argument. We will later revert this in HEAD in favor of correcting the catalogs. Our thanks to Sumit Soni (via Secunia SVCRP) for reporting this issue. Security: CVE-2013-0255
2013-02-04 | Reset vacuum_defer_cleanup_age to PGC_SIGHUP. | Simon Riggs
Revert commit 84725aa5efe11688633b553e58113efce4181f2e
2013-02-02 | Mark vacuum_defer_cleanup_age as PGC_POSTMASTER. | Simon Riggs
Following bug analysis of #7819 by Tom Lane
2013-01-14 | Reject out-of-range dates in to_date(). | Tom Lane
Dates outside the supported range could be entered, but would not print reasonably, and operations such as conversion to timestamp wouldn't behave sanely either. Since this has the potential to result in undumpable table data, it seems worth back-patching. Hitoshi Harada
2012-12-11 | Add defenses against integer overflow in dynahash numbuckets calculations. | Tom Lane
The dynahash code requires the number of buckets in a hash table to fit in an int; but since we calculate the desired hash table size dynamically, there are various scenarios where we might calculate too large a value. The resulting overflow can lead to infinite loops, division-by-zero crashes, etc. I (tgl) had previously installed some defenses against that in commit 299d1716525c659f0e02840e31fbe4dea3, but that covered only one call path. Moreover it worked by limiting the request size to work_mem, but in a 64-bit machine it's possible to set work_mem high enough that the problem appears anyway. So let's fix the problem at the root by installing limits in the dynahash.c functions themselves. Trouble report and patch by Jeff Davis.
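Installing the limit inside the size calculation itself might look roughly like this. It is a sketch under assumptions: bounded_next_pow2 is a hypothetical name, and the real dynahash.c functions may be organized differently. The request is clamped before rounding up to a power of 2, so the result always fits in an int regardless of how large the caller's request was:

```c
#include <limits.h>

/*
 * Hypothetical helper: smallest power of 2 that is >= num, clamped so
 * that the result always fits in an int.
 */
int
bounded_next_pow2(long num)
{
    /* Largest power of 2 representable in an int (2^30 for 32-bit int). */
    const long  limit = INT_MAX / 2 + 1;
    int         result = 1;

    if (num > limit)
        num = limit;

    /* result never exceeds limit here, so the shift cannot overflow. */
    while (result < num)
        result <<= 1;

    return result;
}
```

Clamping in the helper means every call path gets the protection, rather than relying on each caller to bound its request.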
2012-11-29 | Fix assorted bugs in CREATE INDEX CONCURRENTLY. | Tom Lane
This patch changes CREATE INDEX CONCURRENTLY so that the pg_index flag changes it makes without exclusive lock on the index are made via heap_inplace_update() rather than a normal transactional update. The latter is not very safe because moving the pg_index tuple could result in concurrent SnapshotNow scans finding it twice or not at all, thus possibly resulting in index corruption. In addition, fix various places in the code that ought to check to make sure that the indexes they are manipulating are valid and/or ready as appropriate. These represent bugs that have existed since 8.2, since a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid index behind, and we ought not try to do anything that might fail with such an index. Also fix RelationReloadIndexInfo to ensure it copies all the pg_index columns that are allowed to change after initial creation. Previously we could have been left with stale values of some fields in an index relcache entry. It's not clear whether this actually had any user-visible consequences, but it's at least a bug waiting to happen. This is a subset of a patch already applied in 9.2 and HEAD. Back-patch into all earlier supported branches. Tom Lane and Andres Freund
2012-11-19 | Improve handling of INT_MIN / -1 and related cases. | Tom Lane
Some platforms throw an exception for this division, rather than returning a necessarily-overflowed result. Since we were testing for overflow after the fact, an exception isn't nice. We can avoid the problem by treating division by -1 as negation. Add some regression tests so that we'll find out if any compilers try to optimize away the overflow check conditions. Back-patch of commit 1f7cb5c30983752ff8de833de30afcaee63536d0. Per discussion with Xi Wang, though this is different from the patch he submitted.
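Treating division by -1 as negation can be sketched as below. The function name checked_int_div is hypothetical. Unlike the after-the-fact overflow test the commit message mentions, this sketch checks for INT_MIN before negating, so that it stays within defined behavior in standard C:

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: computes a / b into *result.
 * Returns false on division by zero or on overflow (INT_MIN / -1).
 */
bool
checked_int_div(int a, int b, int *result)
{
    if (b == 0)
        return false;

    if (b == -1)
    {
        /* -INT_MIN is not representable; report overflow, never divide. */
        if (a == INT_MIN)
            return false;
        *result = -a;           /* division by -1 is just negation */
        return true;
    }

    *result = a / b;            /* no other case can overflow */
    return true;
}
```

The hardware divide instruction is never executed with a divisor of -1, so the platform-specific exception cannot occur.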
2012-11-18 | Limit values of archive_timeout, post_auth_delay, auth_delay.milliseconds. | Tom Lane
The previous definitions of these GUC variables allowed them to range up to INT_MAX, but in point of fact the underlying code would suffer overflows or other errors with large values. Reduce the maximum values to something that won't misbehave. There's no apparent value in working harder than this, since very large delays aren't sensible for any of these. (Note: the risk with archive_timeout is that if we're late checking the state, the timestamp difference it's being compared to might overflow. So we need some amount of slop; the choice of INT_MAX/2 is arbitrary.) Per followup investigation of bug #7670. Although this isn't a very significant fix, might as well back-patch.
2012-11-14 | Fix the int8 and int2 cases of (minimum possible integer) % (-1). | Tom Lane
The correct answer for this (or any other case with arg2 = -1) is zero, but some machines throw a floating-point exception instead of behaving sanely. Commit f9ac414c35ea084ff70c564ab2c32adb06d5296f dealt with this in int4mod, but overlooked the fact that it also happens in int8mod (at least on my Linux x86_64 machine). Protect int2mod as well; it's not clear whether any machines fail there (mine does not) but since the test is so cheap it seems better safe than sorry. While at it, simplify the original guard in int4mod: we need only check for arg2 == -1, we don't need to check arg1 explicitly. Xi Wang, with some editing by me.
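The simplified guard (check only for arg2 == -1) can be illustrated for the 64-bit case. This is a sketch with a hypothetical name, checked_int64_mod, and not the int8mod source:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: computes a % b into *result.
 * Returns false only for division by zero.
 */
bool
checked_int64_mod(int64_t a, int64_t b, int64_t *result)
{
    if (b == 0)
        return false;

    /*
     * Anything modulo -1 is zero.  Answering directly avoids executing
     * INT64_MIN % -1, which traps on some machines.
     */
    if (b == -1)
    {
        *result = 0;
        return true;
    }

    *result = a % b;
    return true;
}
```

There is no need to test the first argument: the shortcut gives the right answer for every dividend, which is why the guard is so cheap.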
2012-11-13 | Fix memory leaks in record_out() and record_send(). | Tom Lane
record_out() leaks memory: it fails to free the strings returned by the per-column output functions, and also is careless about detoasted values. This results in a query-lifespan memory leakage when returning composite values to the client, because printtup() runs the output functions in the query-lifespan memory context. Fix it to handle these issues the same way printtup() does. Also fix a similar leakage in record_send(). (At some point we might want to try to run output functions in shorter-lived memory contexts, so that we don't need a zero-leakage policy for them. But that would be a significantly more invasive patch, which doesn't seem like material for back-patching.) In passing, use appendStringInfoCharMacro instead of appendStringInfoChar in the innermost data-copying loop of record_out, to try to shave a few cycles from this function's runtime. Per trouble report from Carlos Henrique Reimer. Back-patch to all supported versions.
2012-10-19 | Fix hash_search to avoid corruption of the hash table on out-of-memory. | Tom Lane
An out-of-memory error during expand_table() on a palloc-based hash table would leave a partially-initialized entry in the table. This would not be harmful for transient hash tables, since they'd get thrown away anyway at transaction abort. But for long-lived hash tables, such as the relcache hash, this would effectively corrupt the table, leading to crash or other misbehavior later. To fix, rearrange the order of operations so that table enlargement is attempted before we insert a new entry, rather than after adding it to the hash table. Problem discovered by Hitoshi Harada, though this is a bit different from his proposed patch.
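The ordering principle (attempt enlargement first, add the entry only once that has succeeded) can be shown on a much simpler structure than a hash table. The sketch below uses a hypothetical growable array, IntTable, purely to illustrate the order of operations; it is not dynahash code:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical container used only to illustrate the ordering. */
typedef struct
{
    int    *items;
    size_t  count;
    size_t  capacity;
} IntTable;

/*
 * Appends value.  Any allocation happens BEFORE the table is modified,
 * so a failed allocation leaves the table exactly as it was.
 */
bool
int_table_insert(IntTable *t, int value)
{
    if (t->count == t->capacity)
    {
        size_t  new_cap = t->capacity ? t->capacity * 2 : 4;
        int    *grown = realloc(t->items, new_cap * sizeof(int));

        if (grown == NULL)
            return false;       /* table untouched, still consistent */
        t->items = grown;
        t->capacity = new_cap;
    }

    /* Nothing below this point can fail. */
    t->items[t->count++] = value;
    return true;
}
```

If the steps were reversed, an allocation failure would strike after the new entry was already visible, which is the partially-initialized state the commit message describes.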
2012-10-19 | Fix ruleutils to print "INSERT INTO foo DEFAULT VALUES" correctly. | Tom Lane
Per bug #7615 from Marko Tiikkaja. Apparently nobody ever tried this case before ...
2012-10-02 | Fix access past end of string in date parsing. | Heikki Linnakangas
This affects date_in(), and a couple of other functions that use DecodeDate(). Hitoshi Harada
2012-08-23 | Fix cascading privilege revoke to notice when privileges are still held. | Tom Lane
If we revoke a grant option from some role X, but X still holds the option via another grant, we should not recursively revoke the privilege from role(s) Y that X had granted it to. This was supposedly fixed as one aspect of commit 4b2dafcc0b1a579ef5daaa2728223006d1ff98e9, but I must not have tested it, because in fact that code never worked: it forgot to shift the grant-option bits back over when masking the bits being revoked. Per bug #6728 from Daniel German. Back-patch to all active branches, since this has been wrong since 8.0.
2012-08-14 | Prevent access to external files/URLs via XML entity references. | Tom Lane
xml_parse() would attempt to fetch external files or URLs as needed to resolve DTD and entity references in an XML value, thus allowing unprivileged database users to attempt to fetch data with the privileges of the database server. While the external data wouldn't get returned directly to the user, portions of it could be exposed in error messages if the data didn't parse as valid XML; and in any case the mere ability to check existence of a file might be useful to an attacker. The ideal solution to this would still allow fetching of references that are listed in the host system's XML catalogs, so that documents can be validated according to installed DTDs. However, doing that with the available libxml2 APIs appears complex and error-prone, so we're not going to risk it in a security patch that necessarily hasn't gotten wide review. So this patch merely shuts off all access, causing any external fetch to silently expand to an empty string. A future patch may improve this. In HEAD and 9.2, also suppress warnings about undefined entities, which would otherwise occur as a result of not loading referenced DTDs. Previous branches don't show such warnings anyway, due to different error handling arrangements. Credit to Noah Misch for first reporting the problem, and for much work towards a solution, though this simplistic approach was not his preference. Also thanks to Daniel Veillard for consultation. Security: CVE-2012-3489
2012-08-03 | Fix bugs with parsing signed hh:mm and hh:mm:ss fields in interval input. | Tom Lane
DecodeInterval() failed to honor the "range" parameter (the special SQL syntax for indicating which fields appear in the literal string) if the time was signed. This seems inappropriate, so make it work like the not-signed case. The inconsistency was introduced in my commit f867339c0148381eb1d01f93ab5c79f9d10211de, which as noted in its log message was only really focused on making SQL-compliant literals work per spec. Including a sign here is not per spec, but if we're going to allow it then it's reasonable to expect it to work like the not-signed case. Also, remove bogus setting of tmask, which caused subsequent processing to think that what had been given was a timezone and not an hh:mm(:ss) field, thus confusing checks for redundant fields. This seems to be an aboriginal mistake in Lockhart's commit 2cf1642461536d0d8f3a1cf124ead0eac04eb760. Add regression test cases to illustrate the changed behaviors. Back-patch as far as 8.4, where support for spec-compliant interval literals was added. Range problem reported and diagnosed by Amit Kapila, tmask problem by me.
2012-07-15 | Prevent corner-case core dump in rfree(). | Tom Lane
rfree() failed to cope with the case that pg_regcomp() had initialized the regex_t struct but then failed to allocate any memory for re->re_guts (ie, the first malloc call in pg_regcomp() failed). It would try to touch the guts struct anyway, and thus dump core. This is a sufficiently narrow corner case that it's not surprising it's never been seen in the field; but still a bug is a bug, so patch all active branches. Noted while investigating whether we need to call pg_regfree after a failure return from pg_regcomp. Other than this bug, it turns out we don't, so adjust comments appropriately.
2012-07-10 | Back-patch fix for extraction of fixed prefixes from regular expressions. | Tom Lane
Back-patch of commits 628cbb50ba80c83917b07a7609ddec12cda172d0 and c6aae3042be5249e672b731ebeb21875b5343010. This has been broken since 7.3, so back-patch to all supported branches.
2012-07-10 | Back-patch addition of pg_wchar-to-multibyte conversion functionality. | Tom Lane
Back-patch of commits 72dd6291f216440f6bb61a8733729a37c7e3b2d2, f6a05fd973a102f7e66c491d3f854864b8d24844, and 60e9c224a197aa37abb1aa3aefa3aad42da61f7f. This is needed to support fixing the regex prefix extraction bug in back branches.
2012-07-09 | Refactor pattern_fixed_prefix() to avoid dealing in incomplete patterns. | Tom Lane
Previously, pattern_fixed_prefix() was defined to return whatever fixed prefix it could extract from the pattern, plus the "rest" of the pattern. That definition was sensible for LIKE patterns, but not so much for regexes, where reconstituting a valid pattern minus the prefix could be quite tricky (certainly the existing code wasn't doing that correctly). Since the only thing that callers ever did with the "rest" of the pattern was to pass it to like_selectivity() or regex_selectivity(), let's cut out the middle-man and just have pattern_fixed_prefix's subroutines do this directly. Then pattern_fixed_prefix can return a simple selectivity number, and the question of how to cope with partial patterns is removed from its API specification. While at it, adjust the API spec so that callers who don't actually care about the pattern's selectivity (which is a lot of them) can pass NULL for the selectivity pointer to skip doing the work of computing a selectivity estimate. This patch is only an API refactoring that doesn't actually change any processing, other than allowing a little bit of useless work to be skipped. However, it's necessary infrastructure for my upcoming fix to regex prefix extraction, because after that change there won't be any simple way to identify the "rest" of the regex, not even to the low level of fidelity needed by regex_selectivity. We can cope with that if regex_fixed_prefix and regex_selectivity communicate directly, but not if we have to work within the old API. Hence, back-patch to all active branches.
2012-05-31 | Revert back-branch changes in behavior of age(xid). | Tom Lane
Per discussion, it does not seem like a good idea to change the behavior of age(xid) in a minor release, even though the old definition causes the function to fail on hot standby slaves. Therefore, revert commit 5829387381d2e4edf84652bb5a712f6185860670 and follow-on commits in the back branches only.
2012-05-30 | Ignore SECURITY DEFINER and SET attributes for a PL's call handler. | Tom Lane
It's not very sensible to set such attributes on a handler function; but if one were to do so, fmgr.c went into infinite recursion because it would call fmgr_security_definer instead of the handler function proper. There is no way for fmgr_security_definer to know that it ought to call the handler and not the original function referenced by the FmgrInfo's fn_oid, so it tries to do the latter, causing the whole process to start over again. Ordinarily such misconfiguration of a procedural language's handler could be written off as superuser error. However, because we allow non-superuser database owners to create procedural languages and the handler for such a language becomes owned by the database owner, it is possible for a database owner to crash the backend, which ideally shouldn't be possible without superuser privileges. In 9.2 and up we will adjust things so that the handler functions are always owned by superusers, but in existing branches this is a minor security fix. Problem noted by Noah Misch (after several of us had failed to detect it :-(). This is CVE-2012-2655.
2012-05-30 | Expand the allowed range of timezone offsets to +/-15:59:59 from Greenwich. | Tom Lane
We used to only allow offsets less than +/-13 hours, then it was +/-14, then it was +/-15. That's still not good enough though, as per today's bug report from Patric Bechtel. This time I actually looked through the Olson timezone database to find the largest offsets used anywhere. The winners are Asia/Manila, at -15:56:00 until 1844, and America/Metlakatla, at +15:13:42 until 1867. So we'd better allow offsets less than +/-16 hours. Given the history, we are way overdue to have some greppable #define symbols controlling this, so make some ... and also remove an obsolete comment that didn't get fixed the last time. Back-patch to all supported branches.
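Greppable symbols for the limit could look something like the following. The macro and function names here are illustrative assumptions, not necessarily the ones the patch introduced; the sketch only encodes the rule stated above, that offsets must be strictly less than 16 hours in either direction:

```c
#include <stdbool.h>

/* Illustrative names; the actual symbols in the patch may differ. */
#define SECS_PER_HOUR       3600
#define MAX_TZ_OFFSET_HOUR  15      /* largest whole-hour part allowed */
#define TZ_OFFSET_LIMIT     ((MAX_TZ_OFFSET_HOUR + 1) * SECS_PER_HOUR)

/* True if the offset (in seconds from Greenwich) is within +/-15:59:59. */
bool
tz_offset_in_range(int offset_secs)
{
    return offset_secs > -TZ_OFFSET_LIMIT && offset_secs < TZ_OFFSET_LIMIT;
}
```

Keeping the bound in one place means the next adjustment, if one is ever needed, is a one-line change that grep can find.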
2012-05-25 | Fix string truncation to be multibyte-aware in text_name and bpchar_name. | Tom Lane
Previously, casts to name could generate invalidly-encoded results. Also, make these functions match namein() more exactly, by consistently using palloc0() instead of ad-hoc zeroing code. Back-patch to all supported branches. Karl Schnaitter and Tom Lane
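To show what multibyte-aware truncation means, here is a sketch for the UTF-8 case only. The function utf8_clip_len is a hypothetical name, the input is assumed to be valid UTF-8, and the server's own code has to cope with other encodings as well, so this is an illustration of the idea rather than the patch:

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns how many leading bytes of s (len bytes of
 * valid UTF-8) to keep so that the result is at most max_bytes long and
 * does not end in the middle of a multibyte character.
 */
size_t
utf8_clip_len(const char *s, size_t len, size_t max_bytes)
{
    size_t      keep;

    if (len <= max_bytes)
        return len;             /* nothing to truncate */

    keep = max_bytes;

    /*
     * s[keep] is the first byte that would be dropped.  If it is a
     * continuation byte (10xxxxxx), the cut falls inside a character,
     * so back up to that character's lead byte.
     */
    while (keep > 0 && ((unsigned char) s[keep] & 0xC0) == 0x80)
        keep--;

    return keep;
}
```

A naive byte-count truncation of "a" followed by a two-byte character to 2 bytes would keep half of the second character; the loop above backs up and keeps only the first byte.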
2012-05-11 | Remove extraneous #include "storage/proc.h" | Simon Riggs
2012-05-11 | Ensure age() returns a stable value rather than the latest value | Simon Riggs
2012-05-09 | Avoid xid error from age() function when run on Hot Standby | Simon Riggs
2012-04-27 | Fix printing of whole-row Vars at top level of a SELECT targetlist. | Tom Lane
Normally whole-row Vars are printed as "tabname.*". However, that does not work at top level of a targetlist, because per SQL standard the parser will think that the "*" should result in column-by-column expansion; which is not at all what a whole-row Var implies. We used to just print the table name in such cases, which works most of the time; but it fails if the table name matches a column name available anywhere in the FROM clause. This could lead for instance to a view being interpreted differently after dump and reload. Adding parentheses doesn't fix it, but there is a reasonably simple kluge we can use instead: attach a no-op cast, so that the "*" isn't syntactically at top level anymore. This makes the printing of such whole-row Vars a lot more consistent with other Vars, and may indeed fix more cases than just the reported one; I'm suspicious that cases involving schema qualification probably didn't work properly before, either. Per bug report and fix proposal from Abbas Butt, though this patch is quite different in detail from his. Back-patch to all supported versions.
2012-04-09 | Fix an Assert that turns out to be reachable after all. | Tom Lane
estimate_num_groups() gets unhappy with create table empty(); select * from empty except select * from empty e2; I can't see any actual use-case for such a query (and the table is illegal per SQL spec), but it seems like a good idea that it not cause an assert failure.
2012-01-30 | Accept a non-existent value in "ALTER USER/DATABASE SET ..." command. | Heikki Linnakangas
When default_text_search_config, default_tablespace, or temp_tablespaces setting is set per-user or per-database, with an "ALTER USER/DATABASE SET ..." statement, don't throw an error if the text search configuration or tablespace does not exist. In case of text search configuration, even if it doesn't exist in the current database, it might exist in another database, where the setting is intended to have its effect. This behavior is now the same as search_path's. Tablespaces are cluster-wide, so the same argument doesn't hold for tablespaces, but there's a problem with pg_dumpall: it dumps "ALTER USER SET ..." statements before the "CREATE TABLESPACE" statements. Arguably that's pg_dumpall's fault - it should dump the statements in such an order that the tablespace is created first and then the "ALTER USER SET default_tablespace ..." statements after that - but it seems better to be consistent with search_path and default_text_search_config anyway. Besides, you could still create a dump that throws an error, by creating the tablespace, running "ALTER USER SET default_tablespace", then dropping the tablespace and running pg_dumpall on that. Backpatch to all supported versions.
2011-12-12 | Revert the behavior of inet/cidr functions to not unpack the arguments. | Heikki Linnakangas
I forgot to change the functions to use the PG_GETARG_INET_PP() macro, when I changed DatumGetInetP() to unpack the datum, like Datum*P macros usually do. Also, I screwed up the definition of the PG_GETARG_INET_PP() macro, and didn't notice because it wasn't used. This fixes the memory leak when sorting inet values, as reported by Jochen Erwied and debugged by Andres Freund. Backpatch to 8.3, like the previous patch that broke it.
2011-12-01 | Fix getTypeIOParam to support type record[]. | Tom Lane
Since record[] uses array_in, it needs to have its element type passed as typioparam. In HEAD and 9.1, this fix essentially reverts commit 9bc933b2125a5358722490acbc50889887bf7680, which was a hack that is no longer needed since domains don't set their typelem anymore. Before that, adjust the logic so that only domains are excluded from being treated like arrays, rather than assuming that only base types should be included. Add a regression test to demonstrate the need for this. Per report from Maxim Boguk. Back-patch to 8.4, where type record[] was added.
2011-11-30 | Tweak previous patch to ensure edata->filename always gets initialized. | Tom Lane
On a platform that isn't supplying __FILE__, previous coding would either crash or give a stale result for the filename string. Not sure how likely that is, but the original code catered for it, so let's keep doing so.
2011-11-29 | Strip file names reported in error messages in vpath builds | Peter Eisentraut
In vpath builds, the __FILE__ macro that is used in verbose error reports contains the full absolute file name, which makes the error messages excessively verbose. So keep only the base name, thus matching the behavior of non-vpath builds.
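Keeping only the base name is a small string operation. The sketch below uses a hypothetical function name, error_file_basename, handles only '/' separators, and passes a NULL input through unchanged as one possible way to cope with a platform that does not supply __FILE__; none of these choices is claimed to match the committed code:

```c
#include <string.h>

/*
 * Hypothetical helper: returns a pointer to the last path component of
 * filename (no copy is made).  NULL in, NULL out.
 */
const char *
error_file_basename(const char *filename)
{
    const char *slash;

    if (filename == NULL)
        return NULL;

    slash = strrchr(filename, '/');
    return slash ? slash + 1 : filename;
}
```

For a non-vpath build, where __FILE__ has no directory part, the function returns its argument unchanged, so both build styles report the same string.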
2011-11-08 | Make DatumGetInetP() unpack inet datums with a 1-byte header, and add | Heikki Linnakangas
a new macro, DatumGetInetPP(), that does not. This brings these macros in line with other DatumGet*P() macros. Backpatch to 8.3, where 1-byte header varlenas were introduced.
2011-11-01 | Fix race condition with toast table access from a stale syscache entry. | Tom Lane
If a tuple in a syscache contains an out-of-line toasted field, and we try to fetch that field shortly after some other transaction has committed an update or deletion of the tuple, there is a race condition: vacuum could come along and remove the toast tuples before we can fetch them. This leads to transient failures like "missing chunk number 0 for toast value NNNNN in pg_toast_2619", as seen in recent reports from Andrew Hammond and Tim Uckun. The design idea of syscache is that access to stale syscache entries should be prevented by relation-level locks, but that fails for at least two cases where toasted fields are possible: ANALYZE updates pg_statistic rows without locking out sessions that might want to plan queries on the same table, and CREATE OR REPLACE FUNCTION updates pg_proc rows without any meaningful lock at all. The least risky fix seems to be an idea that Heikki suggested when we were dealing with a related problem back in August: forcibly detoast any out-of-line fields before putting a tuple into syscache in the first place. This avoids the problem because at the time we fetch the parent tuple from the catalog, we should be holding an MVCC snapshot that will prevent removal of the toast tuples, even if the parent tuple is outdated immediately after we fetch it. (Note: I'm not convinced that this statement holds true at every instant where we could be fetching a syscache entry at all, but it does appear to hold true at the times where we could fetch an entry that could have a toasted field. We will need to be a bit wary of adding toast tables to low-level catalogs that don't have them already.) An additional benefit is that subsequent uses of the syscache entry should be faster, since they won't have to detoast the field. Back-patch to all supported versions. 
The problem is significantly harder to reproduce in pre-9.0 releases, because of their willingness to flush every entry in a syscache whenever the underlying catalog is vacuumed (cf CatalogCacheFlushRelation); but there is still a window for trouble.
2011-10-29 | Fix assorted bogosities in cash_in() and cash_out(). | Tom Lane
cash_out failed to handle multiple-byte thousands separators, as per bug #6277 from Alexander Law. In addition, cash_in didn't handle that either, nor could it handle multiple-byte positive_sign. Both routines failed to support multiple-byte mon_decimal_point, which I did not think was worth changing, but at least now they check for the possibility and fall back to using '.' rather than emitting invalid output. Also, make cash_in handle trailing negative signs, which formerly it would reject. Since cash_out generates trailing negative signs whenever the locale tells it to, this last omission represents a fail-to-reload-dumped-data bug. IMO that justifies patching this all the way back.
2011-10-23 | Don't trust deferred-unique indexes for join removal. | Tom Lane
The uniqueness condition might fail to hold intra-transaction, and assuming it does can give incorrect query results. Per report from Marti Raudsepp, though this is not his proposed patch. Back-patch to 9.0, where both these features were introduced. In the released branches, add the new IndexOptInfo field to the end of the struct, to try to minimize ABI breakage for third-party code that may be examining that struct.
2011-10-04 | Add sourcefile/sourceline data to EXEC_BACKEND GUC transmission files. | Tom Lane
This oversight meant that on Windows, the pg_settings view would not display source file or line number information for values coming from postgresql.conf, unless the backend had received a SIGHUP since starting. In passing, also make the error detection in read_nondefault_variables a tad more thorough, and fix it to not lose precision on float GUCs (these changes are already in HEAD as of my previous commit).
2011-09-07 | Fix corner case bug in numeric to_char(). | Tom Lane
Trailing-zero stripping applied by the FM specifier could strip zeroes to the left of the decimal point, for a format with no digit positions after the decimal point (such as "FM999."). Reported and diagnosed by Marti Raudsepp, though I didn't use his patch.
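The invariant at stake is that trailing-zero stripping must stop at the decimal point. A minimal sketch of that rule on a plain string, using a hypothetical function strip_trailing_fraction_zeros (this is not the formatting.c logic, which works on format nodes rather than finished strings):

```c
#include <string.h>

/*
 * Hypothetical helper: removes trailing '0' characters that follow the
 * decimal point in num.  Digits to the left of the point are never
 * touched, and a string without a point is left unchanged.
 */
void
strip_trailing_fraction_zeros(char *num)
{
    char       *point = strchr(num, '.');
    char       *end;

    if (point == NULL)
        return;                 /* integer: every zero is significant */

    end = num + strlen(num);

    /* Stop once end reaches the character just after the point. */
    while (end > point + 1 && end[-1] == '0')
        end--;

    *end = '\0';
}
```

With a value such as "100." there are no digits after the point, so the loop never runs and the zeroes of the integer part survive.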
2011-09-06 | Avoid possibly accessing off the end of memory in SJIS2004 conversion. | Tom Lane
The code in shift_jis_20042euc_jis_2004() would fetch two bytes even when only one remained in the string. Since conversion functions aren't supposed to assume null-terminated input, this poses a small risk of fetching past the end of memory and incurring SIGSEGV. No such crash has been identified in the field, but we've certainly seen the equivalent happen in other code paths, so patch this one all the way back. Report and patch by Noah Misch.
2011-09-01 | Further repair of eqjoinsel ndistinct-clamping logic. | Tom Lane
Examination of examples provided by Mark Kirkwood and others has convinced me that actually commit 7f3eba30c9d622d1981b1368f2d79ba0999cdff2 was quite a few bricks shy of a load. The useful part of that patch was clamping ndistinct for the inner side of a semi or anti join, and the reason why that's needed is that it's the only way that restriction clauses eliminating rows from the inner relation can affect the estimated size of the join result. I had not clearly understood why the clamping was appropriate, and so mis-extrapolated to conclude that we should clamp ndistinct for the outer side too, as well as for both sides of regular joins. These latter actions were all wrong, and are reverted with this patch. In addition, the clamping logic is now made to affect the behavior of both paths in eqjoinsel_semi, with or without MCV lists to compare. When we have MCVs, we suppose that the most common values are the ones that are most likely to survive the decimation resulting from a lower restriction clause, so we think of the clamping as eliminating non-MCV values, or potentially even the least-common MCVs for the inner relation. Back-patch to 8.4, same as previous fixes in this area.
2011-08-31 | Improve eqjoinsel's ndistinct clamping to work for multiple levels of join. | Tom Lane
This patch fixes an oversight in my commit 7f3eba30c9d622d1981b1368f2d79ba0999cdff2 of 2008-10-23. That patch accounted for baserel restriction clauses that reduced the number of rows coming out of a table (and hence the number of possibly-distinct values of a join variable), but not for join restriction clauses that might have been applied at a lower level of join. To account for the latter, look up the sizes of the min_lefthand and min_righthand inputs of the current join, and clamp with those in the same way as for the base relations. Noted while investigating a complaint from Ben Chobot, although this in itself doesn't seem to explain his report. Back-patch to 8.4; previous versions used different estimation methods for which this heuristic isn't relevant.
2011-08-30 | Fix a missed case in code for "moving average" estimate of reltuples. | Tom Lane
It is possible for VACUUM to scan no pages at all, if the visibility map shows that all pages are all-visible. In this situation VACUUM has no new information to report about the relation's tuple density, so it wasn't changing pg_class.reltuples ... but it updated pg_class.relpages anyway. That's wrong in general, since there is no evidence to justify changing the density ratio reltuples/relpages, but it's particularly bad if the previous state was relpages=reltuples=0, which means "unknown tuple density". We just replaced "unknown" with "zero". ANALYZE would eventually recover from this, but it could take a lot of repetitions of ANALYZE to do so if the relation size is much larger than the maximum number of pages ANALYZE will scan, because of the moving-average behavior introduced by commit b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8. The only known situation where we could have relpages=reltuples=0 and yet the visibility map asserts everything's visible is immediately following a pg_upgrade. It might be advisable for pg_upgrade to try to preserve the relpages/reltuples statistics; but in any case this code is wrong on its own terms, so fix it. Per report from Sergey Koposov. Back-patch to 8.4, where the visibility map was introduced, same as the previous change.
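One plausible shape for a moving-average tuple estimate with the missed case handled is sketched below. This is an assumption-laden illustration: the function name estimate_reltuples, its parameters, and the exact weighting are made up for the example and are not taken from the commit referenced above. The key line is the first check, which returns the old estimate when no pages were scanned:

```c
/*
 * Hypothetical sketch of a density-based moving average.
 * old_rel_pages/old_rel_tuples: previously stored statistics
 * total_pages: current size of the relation
 * scanned_pages/scanned_tuples: what this pass actually looked at
 */
double
estimate_reltuples(double old_rel_pages, double old_rel_tuples,
                   double total_pages,
                   double scanned_pages, double scanned_tuples)
{
    double      old_density;
    double      new_density;
    double      weight;

    /* Nothing scanned: no new information, keep the old estimate. */
    if (scanned_pages <= 0)
        return old_rel_tuples;

    /* Everything scanned: the count is exact. */
    if (scanned_pages >= total_pages)
        return scanned_tuples;

    new_density = scanned_tuples / scanned_pages;

    /* No usable old density: extrapolate from the scanned pages alone. */
    if (old_rel_pages <= 0)
        return new_density * total_pages;

    /* Move the old density toward the new one, weighted by coverage. */
    old_density = old_rel_tuples / old_rel_pages;
    weight = scanned_pages / total_pages;
    return (old_density + (new_density - old_density) * weight) * total_pages;
}
```

In the zero-pages case the caller would also need to leave the stored page count alone, since, as the message notes, changing relpages without evidence changes the density ratio just as surely as changing reltuples.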
2011-08-26 | Fix potential memory clobber in tsvector_concat(). | Tom Lane
tsvector_concat() allocated its result workspace using the "conservative" estimate of the sum of the two input tsvectors' sizes. Unfortunately that wasn't so conservative as all that, because it supposed that the number of pad bytes required could not grow. Which it can, as per test case from Jesper Krogh, if there's a mix of lexemes with positions and lexemes without them in the input data. The fix is to assume that we might add a not-previously-present pad byte for each and every lexeme in the two inputs; which really is conservative, but it doesn't seem worthwhile to try to be more precise. This is an aboriginal bug in tsvector_concat, so back-patch to all versions containing it.