summaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2015-07-28Disable ssl renegotiation by default.Andres Freund
While postgres' use of SSL renegotiation is a good idea in theory, it turned out to not work well in practice. The specification and openssl's implementation of it have lead to several security issues. Postgres' use of renegotiation also had its share of bugs. Additionally OpenSSL has a bunch of bugs around renegotiation, reported and open for years, that regularly lead to connections breaking with obscure error messages. We tried increasingly complex workarounds to get around these bugs, but we didn't find anything complete. Since these connection breakages often lead to hard to debug problems, e.g. spuriously failing base backups and significant latency spikes when synchronous replication is used, we have decided to change the default setting for ssl renegotiation to 0 (disabled) in the released backbranches and remove it entirely in 9.5 and master.. Author: Michael Paquier, with changes by me Discussion: 20150624144148.GQ4797@alap3.anarazel.de Backpatch: 9.0-9.4; 9.5 and master get a different patch
2015-07-28Remove an unsafe Assert, and explain join_clause_is_movable_into() better.Tom Lane
join_clause_is_movable_into() is approximate, in the sense that it might sometimes return "false" when actually it would be valid to push the given join clause down to the specified level. This is okay ... but there was an Assert in get_joinrel_parampathinfo() that's only safe if the answers are always exact. Comment out the Assert, and add a bunch of commentary to clarify what's going on. Per fuzz testing by Andreas Seltenreich. The added regression test is a pretty silly query, but it's based on his crasher example. Back-patch to 9.2 where the faulty logic was introduced.
2015-07-27Don't assume that PageIsEmpty() returns true on an all-zeros page.Heikki Linnakangas
It does currently, and I don't see us changing that any time soon, but we don't make that assumption anywhere else. Per Tom Lane's suggestion. Backpatch to 9.2, like the previous patch that added this assumption.
2015-07-27Reuse all-zero pages in GIN.Heikki Linnakangas
In GIN, an all-zeros page would be leaked forever, and never reused. Just add them to the FSM in vacuum, and they will be reinitialized when grabbed from the FSM. On master and 9.5, attempting to access the page's opaque struct also caused an assertion failure, although that was otherwise harmless. Reported by Jeff Janes. Backpatch to all supported versions.
2015-07-27Fix handling of all-zero pages in SP-GiST vacuum.Heikki Linnakangas
SP-GiST initialized an all-zeros page at vacuum, but that was not WAL-logged, which is not safe. You might get a torn page write, when it gets flushed to disk, and end-up with a half-initialized index page. To fix, leave it in the all-zeros state, and add it to the FSM. It will be initialized when reused. Also don't set the page-deleted flag when recycling an empty page. That was also not WAL-logged, and a torn write of that would cause the page to have an invalid checksum. Backpatch to 9.2, where SP-GiST indexes were added.
2015-07-26Make entirely-dummy appendrels get marked as such in set_append_rel_size.Tom Lane
The planner generally expects that the estimated rowcount of any relation is at least one row, *unless* it has been proven empty by constraint exclusion or similar mechanisms, which is marked by installing a dummy path as the rel's cheapest path (cf. IS_DUMMY_REL). When I split up allpaths.c's processing of base rels into separate set_base_rel_sizes and set_base_rel_pathlists steps, the intention was that dummy rels would get marked as such during the "set size" step; this is what justifies an Assert in indxpath.c's get_loop_count that other relations should either be dummy or have positive rowcount. Unfortunately I didn't get that quite right for append relations: if all the child rels have been proven empty then set_append_rel_size would come up with a rowcount of zero, which is correct, but it didn't then do set_dummy_rel_pathlist. (We would have ended up with the right state after set_append_rel_pathlist, but that's too late, if we generate indexpaths for some other rel first.) In addition to fixing the actual bug, I installed an Assert enforcing this convention in set_rel_size; that then allows simplification of a couple of now-redundant tests for zero rowcount in set_append_rel_size. Also, to cover the possibility that third-party FDWs have been careless about not returning a zero rowcount estimate, apply clamp_row_est to whatever an FDW comes up with as the rows estimate. Per report from Andreas Seltenreich. Back-patch to 9.2. Earlier branches did not have the separation between set_base_rel_sizes and set_base_rel_pathlists steps, so there was no intermediate state where an appendrel would have had inconsistent rowcount and pathlist. It's possible that adding the Assert to set_rel_size would be a good idea in older branches too; but since they're not under development any more, it's likely not worth the trouble.
2015-07-25Restore use of zlib default compression in pg_dump directory mode.Andrew Dunstan
This was broken by commit 0e7e355f27302b62af3e1add93853ccd45678443 and friends, which ignored the fact that gzopen() will treat "-1" in the mode argument as an invalid character, which it ignores, and a flag for compression level 1. Now, when this value is encountered no compression level flag is passed to gzopen, leaving it to use the zlib default. Also, enforce the documented allowed range for pg_dump's -Z option, namely 0 .. 9, and remove some consequently dead code from pg_backup_tar.c. Problem reported by Marc Mamin. Backpatch to 9.1, like the patch that introduced the bug.
2015-07-23Fix off-by-one error in calculating subtrans/multixact truncation point.Heikki Linnakangas
If there were no subtransactions (or multixacts) active, we would calculate the oldestxid == next xid. That's correct, but if next XID happens to be on the next pg_subtrans (pg_multixact) page, the page does not exist yet, and SimpleLruTruncate will produce an "apparent wraparound" warning. The warning is harmless in this case, but looks very alarming to users. Backpatch to all supported versions. Patch and analysis by Thomas Munro.
2015-07-20Fix (some of) pltcl memory usageAlvaro Herrera
As reported by Bill Parker, PL/Tcl did not validate some malloc() calls against NULL return. Fix by using palloc() in a new long-lived memory context instead. This allows us to simplify error handling too, by simply deleting the memory context instead of doing retail frees. There's still a lot that could be done to improve PL/Tcl's memory handling ... This is pretty ancient, so backpatch all the way back. Author: Michael Paquier and Álvaro Herrera Discussion: https://www.postgresql.org/message-id/CAFrbyQwyLDYXfBOhPfoBGqnvuZO_Y90YgqFM11T2jvnxjLFmqw@mail.gmail.com
2015-07-18Make WaitLatchOrSocket's timeout detection more robust.Tom Lane
In the previous coding, timeout would be noticed and reported only when poll() or socket() returned zero (or the equivalent behavior on Windows). Ordinarily that should work well enough, but it seems conceivable that we could get into a state where poll() always returns a nonzero value --- for example, if it is noticing a condition on one of the file descriptors that we do not think is reason to exit the loop. If that happened, we'd be in a busy-wait loop that would fail to terminate even when the timeout expires. We can make this more robust at essentially no cost, by deciding to exit of our own accord if we compute a zero or negative time-remaining-to-wait. Previously the code noted this but just clamped the time-remaining to zero, expecting that we'd detect timeout on the next loop iteration. Back-patch to 9.2. While 9.1 had a version of WaitLatchOrSocket, it was primitive compared to later versions, and did not guarantee reliable detection of timeouts anyway. (Essentially, this is a refinement of commit 3e7fdcffd6f77187, which was back-patched only as far as 9.2.)
2015-07-17AIX: Test the -qlonglong option before use.Noah Misch
xlc provides "long long" unconditionally at C99-compatible language levels, and this option provokes a warning. The warning interferes with "configure" tests that fail in response to any warning. Notably, before commit 85a2a8903f7e9151793308d0638621003aded5ae, it interfered with the test for -qnoansialias. Back-patch to 9.0 (all supported versions).
2015-07-16Fix a low-probability crash in our qsort implementation.Tom Lane
It's standard for quicksort implementations, after having partitioned the input into two subgroups, to recurse to process the smaller partition and then handle the larger partition by iterating. This method guarantees that no more than log2(N) levels of recursion can be needed. However, Bentley and McIlroy argued that checking to see which partition is smaller isn't worth the cycles, and so their code doesn't do that but just always recurses on the left partition. In most cases that's fine; but with worst-case input we might need O(N) levels of recursion, and that means that qsort could be driven to stack overflow. Such an overflow seems to be the only explanation for today's report from Yiqing Jin of a SIGSEGV in med3_tuple while creating an index of a couple billion entries with a very large maintenance_work_mem setting. Therefore, let's spend the few additional cycles and lines of code needed to choose the smaller partition for recursion. Also, fix up the qsort code so that it properly uses size_t not int for some intermediate values representing numbers of items. This would only be a live risk when sorting more than INT_MAX bytes (in qsort/qsort_arg) or tuples (in qsort_tuple), which I believe would never happen with any caller in the current core code --- but perhaps it could happen with call sites in third-party modules? In any case, this is trouble waiting to happen, and the corrected code is probably if anything shorter and faster than before, since it removes sign-extension steps that had to happen when converting between int and size_t. In passing, move a couple of CHECK_FOR_INTERRUPTS() calls so that it's not necessary to preserve the value of "r" across them, and prettify the output of gen_qsort_tuple.pl a little. Back-patch to all supported branches. The odds of hitting this issue are probably higher in 9.4 and up than before, due to the new ability to allocate sort workspaces exceeding 1GB, but there's no good reason to believe that it's impossible to crash older branches this way.
2015-07-15AIX: Link the postgres executable with -Wl,-brtllib.Noah Misch
This allows PostgreSQL modules and their dependencies to have undefined symbols, resolved at runtime. Perl module shared objects rely on that in Perl 5.8.0 and later. This fixes the crash when PL/PerlU loads such modules, as the hstore_plperl test suite does. Module authors can link using -Wl,-G to permit undefined symbols; by default, linking will fail as it has. Back-patch to 9.0 (all supported versions).
2015-07-12Fix assorted memory leaks.Tom Lane
Per Coverity (not that any of these are so non-obvious that they should not have been caught before commit). The extent of leakage is probably minor to unnoticeable, but a leak is a leak. Back-patch as necessary. Michael Paquier
2015-07-09Fix postmaster's handling of a startup-process crash.Tom Lane
Ordinarily, a failure (unexpected exit status) of the startup subprocess should be considered fatal, so the postmaster should just close up shop and quit. However, if we sent the startup process a SIGQUIT or SIGKILL signal, the failure is hardly "unexpected", and we should attempt restart; this is necessary for recovery from ordinary backend crashes in hot-standby scenarios. I attempted to implement the latter rule with a two-line patch in commit 442231d7f71764b8c628044e7ce2225f9aa43b67, but it now emerges that that patch was a few bricks shy of a load: it failed to distinguish the case of a signaled startup process from the case where the new startup process crashes before reaching database consistency. That resulted in infinitely respawning a new startup process only to have it crash again. To handle this properly, we really must track whether we have sent the *current* startup process a kill signal. Rather than add yet another ad-hoc boolean to the postmaster's state, I chose to unify this with the existing RecoveryError flag into an enum tracking the startup process's state. That seems more consistent with the postmaster's general state machine design. Back-patch to 9.0, like the previous patch.
2015-07-08Fix null pointer dereference in "\c" psql command.Noah Misch
The psql crash happened when no current connection existed. (The second new check is optional given today's undocumented NULL argument handling in PQhost() etc.) Back-patch to 9.0 (all supported versions).
2015-07-07Improve handling of out-of-memory in libpq.Heikki Linnakangas
If an allocation fails in the main message handling loop, pqParseInput3 or pqParseInput2, it should not be treated as "not enough data available yet". Otherwise libpq will wait indefinitely for more data to arrive from the server, and gets stuck forever. This isn't a complete fix - getParamDescriptions and getCopyStart still have the same issue, but it's a step in the right direction. Michael Paquier and me. Backpatch to all supported versions.
2015-07-07Turn install.bat into a pure one line wrapper fort he perl script.Heikki Linnakangas
Build.bat and vcregress.bat got similar treatment years ago. I'm not sure why install.bat wasn't treated at the same time, but it seems like a good idea anyway. The immediate problem with the old install.bat was that it had quoting issues, and wouldn't work if the target directory's name contained spaces. This fixes that problem. I committed this to master yesterday, this is a backpatch of the same for all supported versions.
2015-07-05Make numeric form of PG version number readily available in Makefiles.Tom Lane
Expose PG_VERSION_NUM (e.g., "90600") as a Make variable; but for consistency with the other Make variables holding similar info, call the variable just VERSION_NUM not PG_VERSION_NUM. There was some discussion of making this value available as a pg_config value as well. However, that would entail substantially more work than this two-line patch. Given that there was not exactly universal consensus that we need this at all, let's just do a minimal amount of work for now. Back-patch of commit a5d489ccb7e613c7ca3be6141092b8c1d2c13fa7, so that this variable is actually useful for its intended purpose sometime before 2020. Michael Paquier, reviewed by Pavel Stehule
2015-07-03PL/Perl: Add alternative expected file for Perl 5.22Peter Eisentraut
2015-06-27Revoke incorrectly applied patch versionSimon Riggs
2015-06-27Avoid hot standby cancels from VAC FREEZESimon Riggs
VACUUM FREEZE generated false cancelations of standby queries on an otherwise idle master. Caused by an off-by-one error on cutoff_xid which goes back to original commit. Backpatch to all versions 9.0+ Analysis and report by Marco Nenciarini Bug fix by Simon Riggs
2015-06-25Fix the logic for putting relations into the relcache init file.Tom Lane
Commit f3b5565dd4e59576be4c772da364704863e6a835 was a couple of bricks shy of a load; specifically, it missed putting pg_trigger_tgrelid_tgname_index into the relcache init file, because that index is not used by any syscache. However, we have historically nailed that index into cache for performance reasons. The upshot was that load_relcache_init_file always decided that the init file was busted and silently ignored it, resulting in a significant hit to backend startup speed. To fix, reinstantiate RelationIdIsInInitFile() as a wrapper around RelationSupportsSysCache(), which can know about additional relations that should be in the init file despite being unknown to syscache.c. Also install some guards against future mistakes of this type: make write_relcache_init_file Assert that all nailed relations get written to the init file, and make load_relcache_init_file emit a WARNING if it takes the "wrong number of nailed relations" exit path. Now that we remove the init files during postmaster startup, that case should never occur in the field, even if we are starting a minor-version update that added or removed rels from the nailed set. So the warning shouldn't ever be seen by end users, but it will show up in the regression tests if somebody breaks this logic. Back-patch to all supported branches, like the previous commit.
2015-06-22Improve inheritance_planner()'s performance for large inheritance sets.Tom Lane
Commit c03ad5602f529787968fa3201b35c119bbc6d782 introduced a planner performance regression for UPDATE/DELETE on large inheritance sets. It required copying the append_rel_list (which is of size proportional to the number of inherited tables) once for each inherited table, thus resulting in O(N^2) time and memory consumption. While it's difficult to avoid that in general, the extra work only has to be done for append_rel_list entries that actually reference subquery RTEs, which inheritance-set entries will not. So we can buy back essentially all of the loss in cases without subqueries in FROM; and even for those, the added work is mainly proportional to the number of UNION ALL subqueries. Back-patch to 9.2, like the previous commit. Tom Lane and Dean Rasheed, per a complaint from Thomas Munro.
2015-06-21Truncate strings in tarCreateHeader() with strlcpy(), not sprintf().Noah Misch
This supplements the GNU libc bug #6530 workarounds introduced in commit 54cd4f04576833abc394e131288bf3dd7dcf4806. On affected systems, a tar-format pg_basebackup failed when some filename beneath the data directory was not valid character data in the postmaster/walsender locale. Back-patch to 9.1, where pg_basebackup was introduced. Extant, bug-prone conversion specifications receive only ASCII bytes or involve low-importance messages.
2015-06-20Fix thinko in comment (launcher -> worker)Alvaro Herrera
2015-06-19Clamp autovacuum launcher sleep time to 5 minutesAlvaro Herrera
This avoids the problem that it might go to sleep for an unreasonable amount of time in unusual conditions like the server clock moving backwards an unreasonable amount of time. (Simply moving the server clock forward again doesn't solve the problem unless you wake up the autovacuum launcher manually, say by sending it SIGHUP). Per trouble report from Prakash Itnal in https://www.postgresql.org/message-id/CAHC5u79-UqbapAABH2t4Rh2eYdyge0Zid-X=Xz-ZWZCBK42S0Q@mail.gmail.com Analyzed independently by Haribabu Kommi and Tom Lane.
2015-06-15Check for out of memory when allocating sqlca.Michael Meskes
Patch by Michael Paquier
2015-06-15Fix memory leak in ecpglib's connect function.Michael Meskes
Patch by Michael Paquier
2015-06-13Fixed some memory leaks in ECPG.Michael Meskes
Patch by Michael Paquier Conflicts: src/interfaces/ecpg/preproc/variable.c
2015-06-13Fix intoasc() in Informix compat lib. This function used to be a noop.Michael Meskes
Patch by Michael Paquier
2015-06-12Improve error message and hint for ALTER COLUMN TYPE can't-cast failure.Tom Lane
We already tried to improve this once, but the "improved" text was rather off-target if you had provided a USING clause. Also, it seems helpful to provide the exact text of a suggested USING clause, so users can just copy-and-paste it when needed. Per complaint from Keith Rarick and a suggestion from Merlin Moncure. Back-patch to 9.2 where the current wording was adopted.
2015-06-09Stamp 9.2.13.REL9_2_13Tom Lane
2015-06-09Report more information if pg_perm_setlocale() fails at startup.Tom Lane
We don't know why a few Windows users have seen this fail, but the taciturnity of the error message certainly isn't helping debug it. Let's at least find out which LC category isn't working.
2015-06-08Allow HotStandbyActiveInReplay() to be called in single user mode.Andres Freund
HotStandbyActiveInReplay, introduced in 061b079f, only allowed WAL replay to happen in the startup process, missing the single user case. This buglet is fairly harmless as it only causes problems when single user mode in an assertion enabled build is used to replay a btree vacuum record. Backpatch to 9.2. 061b079f was backpatched further, but the assertion was not.
2015-06-07Use a safer method for determining whether relcache init file is stale.Tom Lane
When we invalidate the relcache entry for a system catalog or index, we must also delete the relcache "init file" if the init file contains a copy of that rel's entry. The old way of doing this relied on a specially maintained list of the OIDs of relations present in the init file: we made the list either when reading the file in, or when writing the file out. The problem is that when writing the file out, we included only rels present in our local relcache, which might have already suffered some deletions due to relcache inval events. In such cases we correctly decided not to overwrite the real init file with incomplete data --- but we still used the incomplete initFileRelationIds list for the rest of the current session. This could result in wrong decisions about whether the session's own actions require deletion of the init file, potentially allowing an init file created by some other concurrent session to be left around even though it's been made stale. Since we don't support changing the schema of a system catalog at runtime, the only likely scenario in which this would cause a problem in the field involves a "vacuum full" on a catalog concurrently with other activity, and even then it's far from easy to provoke. Remarkably, this has been broken since 2002 (in commit 786340441706ac1957a031f11ad1c2e5b6e18314), but we had never seen a reproducible test case until recently. If it did happen in the field, the symptoms would probably involve unexpected "cache lookup failed" errors to begin with, then "could not open file" failures after the next checkpoint, as all accesses to the affected catalog stopped working. Recovery would require manually removing the stale "pg_internal.init" file. To fix, get rid of the initFileRelationIds list, and instead consult syscache.c's list of relations used in catalog caches to decide whether a relation is included in the init file. This should be a tad more efficient anyway, since we're replacing linear search of a list with ~100 entries with a binary search. It's a bit ugly that the init file contents are now so directly tied to the catalog caches, but in practice that won't make much difference. Back-patch to all supported branches.
2015-06-05Fix incorrect order of database-locking operations in InitPostgres().Tom Lane
We should set MyProc->databaseId after acquiring the per-database lock, not beforehand. The old way risked deadlock against processes trying to copy or delete the target database, since they would first acquire the lock and then wait for processes with matching databaseId to exit; that left a window wherein an incoming process could set its databaseId and then block on the lock, while the other process had the lock and waited in vain for the incoming process to exit. CountOtherDBBackends() would time out and fail after 5 seconds, so this just resulted in an unexpected failure not a permanent lockup, but it's still annoying when it happens. A real-world example of a use-case is that short-duration connections to a template database should not cause CREATE DATABASE to fail. Doing it in the other order should be fine since the contract has always been that processes searching the ProcArray for a database ID must hold the relevant per-database lock while searching. Thus, this actually removes the former race condition that required an assumption that storing to MyProc->databaseId is atomic. It's been like this for a long time, so back-patch to all active branches.
2015-06-01Stamp 9.2.12.REL9_2_12Tom Lane
2015-05-29Remove special cases for ETXTBSY from new fsync'ing logic.Tom Lane
The argument that this is a sufficiently-expected case to be silently ignored seems pretty thin. Andres had brought it up back when we were still considering that most fsync failures should be hard errors, and it probably would be legit not to fail hard for ETXTBSY --- but the same is true for EROFS and other cases, which is why we gave up on hard failures. ETXTBSY is surely not a normal case, so logging the failure seems fine from here.
2015-05-28Fix fsync-at-startup code to not treat errors as fatal.Tom Lane
Commit 2ce439f3379aed857517c8ce207485655000fc8e introduced a rather serious regression, namely that if its scan of the data directory came across any un-fsync-able files, it would fail and thereby prevent database startup. Worse yet, symlinks to such files also caused the problem, which meant that crash restart was guaranteed to fail on certain common installations such as older Debian. After discussion, we agreed that (1) failure to start is worse than any consequence of not fsync'ing is likely to be, therefore treat all errors in this code as nonfatal; (2) we should not chase symlinks other than those that are expected to exist, namely pg_xlog/ and tablespace links under pg_tblspc/. The latter restriction avoids possibly fsync'ing a much larger part of the filesystem than intended, if the user has left random symlinks hanging about in the data directory. This commit takes care of that and also does some code beautification, mainly moving the relevant code into fd.c, which seems a much better place for it than xlog.c, and making sure that the conditional compilation for the pre_sync_fname pass has something to do with whether pg_flush_data works. I also relocated the call site in xlog.c down a few lines; it seems a bit silly to be doing this before ValidateXLOGDirectoryStructure(). The similar logic in initdb.c ought to be made to match this, but that change is noncritical and will be dealt with separately. Back-patch to all active branches, like the prior commit. Abhijit Menon-Sen and Tom Lane
2015-05-28Fix pg_get_functiondef() to print a function's LEAKPROOF property.Tom Lane
Seems to have been an oversight in the original leakproofness patch. Per report and patch from Jeevan Chalke. In passing, prettify some awkward leakproof-related code in AlterFunction.
2015-05-27Fix portability issue in isolationtester grammar.Tom Lane
specparse.y and specscanner.l used "string" as a token name. Now, bison likes to define each token name as a macro for the token code it assigns, which means those names are basically off-limits for any other use within the grammar file or included headers. So names as generic as "string" are dangerous. This is what was causing the recent failures on protosciurus: some versions of Solaris' sys/kstat.h use "string" as a field name. With late-model bison we don't see this problem because the token macros aren't defined till later (that is why castoroides didn't show the problem even though it's on the same machine). But protosciurus uses bison 1.875 which defines the token macros up front. This land mine has been there from day one; we'd have found it sooner except that protosciurus wasn't trying to run the isolation tests till recently. To fix, rename the token to "string_literal" which is hopefully less likely to collide with names used by system headers. Back-patch to all branches containing the isolation tests.
2015-05-24Rename pg_shdepend.c's typedef "objectType" to SharedDependencyObjectType.Tom Lane
The name objectType is widely used as a field name, and it's pure luck that this conflict has not caused pgindent to go crazy before. It messed up pg_audit.c pretty good though. Since pg_shdepend.c doesn't export this typedef and only uses it in three places, changing that seems saner than changing the field usages. Back-patch because we're contemplating using the union of all branch typedefs for future pgindent runs, so this won't fix anything if it stays the same in back branches.
2015-05-21Back-patch libpq support for TLS versions beyond v1.Tom Lane
Since 7.3.2, libpq has been coded in such a way that the only SSL protocol it would allow was TLS v1. That approach is looking increasingly obsolete. In commit 820f08cabdcbb899 we fixed it to allow TLS >= v1, but did not back-patch the change at the time, partly out of caution and partly because the question was confused by a contemporary server-side change to reject the now-obsolete SSL protocol v3. 9.4 has now been out long enough that it seems safe to assume the change is OK; hence, back-patch into 9.0-9.3. (I also chose to back-patch some relevant comments added by commit 326e1d73c476a0b5, but did *not* change the server behavior; hence, pre-9.4 servers will continue to allow SSL v3, even though no remotely modern client will request it.) Per gripe from Jan Bilek.
2015-05-19Revert error-throwing wrappers for the printf family of functions.Tom Lane
This reverts commit 16304a013432931e61e623c8d85e9fe24709d9ba, except for its changes in src/port/snprintf.c; as well as commit cac18a76bb6b08f1ecc2a85e46c9d2ab82dd9d23 which is no longer needed. Fujii Masao reported that the previous commit caused failures in psql on OS X, since if one exits the pager program early while viewing a query result, psql sees an EPIPE error from fprintf --- and the wrapper function thought that was reason to panic. (It's a bit surprising that the same does not happen on Linux.) Further discussion among the security list concluded that the risk of other such failures was far too great, and that the one-size-fits-all approach to error handling embodied in the previous patch is unlikely to be workable. This leaves us again exposed to the possibility of the type of failure envisioned in CVE-2015-3166. However, that failure mode is strictly hypothetical at this point: there is no concrete reason to believe that an attacker could trigger information disclosure through the supposed mechanism. In the first place, the attack surface is fairly limited, since so much of what the backend does with format strings goes through stringinfo.c or psprintf(), and those already had adequate defenses. In the second place, even granting that an unprivileged attacker could control the occurrence of ENOMEM with some precision, it's a stretch to believe that he could induce it just where the target buffer contains some valuable information. So we concluded that the risk of non-hypothetical problems induced by the patch greatly outweighs the security risks. We will therefore revert, and instead undertake closer analysis to identify specific calls that may need hardening, rather than attempt a universal solution. We have kept the portion of the previous patch that improved snprintf.c's handling of errors when it calls the platform's sprintf(). That seems to be an unalloyed improvement. Security: CVE-2015-3166
2015-05-19Fix off-by-one error in Assertion.Heikki Linnakangas
The point of the assertion is to ensure that the arrays allocated in stack are large enough, but the check was one item short. This won't matter in practice because MaxIndexTuplesPerPage is an overestimate, so you can't have that many items on a page in reality. But let's be tidy. Spotted by Anastasia Lubennikova. Backpatch to all supported versions, like the patch that added the assertion.
2015-05-18Don't MultiXactIdIsRunning when in recoveryAlvaro Herrera
In 9.1 and earlier, it is possible for index_getnext() to try to examine a heap buffer for possible HOT-prune when in recovery; this causes a problem when a multixact is found in a tuple's Xmax, because GetMultiXactIdMembers refuses to run when in recovery, raising an error: ERROR: cannot GetMultiXactIdMembers() during recovery This can be solved easily by having MultiXactIdIsRunning always return false when in recovery, which is reasonable because a HOT standby cannot acquire further tuple locks nor update/delete tuples. (Note: it doesn't look like this specific code path has a problem in 9.2, because instead of doing HeapTupleSatisfiesUpdate directly, heap_hot_search_buffer uses HeapTupleIsSurelyDead instead. Still, there may be other paths affected by the same bug, for instance in pgrowlocks, and the multixact code hasn't changed; so apply the same fix throughout.) Apply this fix to 9.0 through 9.2. In 9.3 the multixact code has been changed completely and is no longer subject to this problem. Per report from Marko Tiikkaja, https://www.postgresql.org/message-id/54EB3283.2080305@joh.to Analysis by Andres Freund
2015-05-18Stamp 9.2.11.Tom Lane
2015-05-18Fix error message in pre_sync_fname.Robert Haas
The old one didn't include %m anywhere, and required extra translation. Report by Peter Eisentraut. Fix by me. Review by Tom Lane.
2015-05-18Check return values of sensitive system library calls.Noah Misch
PostgreSQL already checked the vast majority of these, missing this handful that nearly cannot fail. If putenv() failed with ENOMEM in pg_GSS_recvauth(), authentication would proceed with the wrong keytab file. If strftime() returned zero in cache_locale_time(), using the unspecified buffer contents could lead to information exposure or a crash. Back-patch to 9.0 (all supported versions). Other unchecked calls to these functions, especially those in frontend code, pose negligible security concern. This patch does not address them. Nonetheless, it is always better to check return values whose specification provides for indicating an error. In passing, fix an off-by-one error in strftime_win32()'s invocation of WideCharToMultiByte(). Upon retrieving a value of exactly MAX_L10N_DATA bytes, strftime_win32() would overrun the caller's buffer by one byte. MAX_L10N_DATA is chosen to exceed the length of every possible value, so the vulnerable scenario probably does not arise. Security: CVE-2015-3166