path: root/src/include
Age | Commit message | Author
2010-12-13 | Tag 8.2.19. (tag: REL8_2_19) | Marc G. Fournier
2010-12-08 | Force default wal_sync_method to be fdatasync on Linux. | Tom Lane
Recent versions of the Linux system header files cause xlogdefs.h to believe that open_datasync should be the default sync method, whereas formerly fdatasync was the default on Linux. open_datasync is a bad choice, first because it doesn't actually outperform fdatasync (in fact the reverse), and second because we try to use O_DIRECT with it, causing failures on certain filesystems (e.g., ext4 with data=journal option). This part of the patch is largely per a proposal from Marti Raudsepp. More extensive changes are likely to follow in HEAD, but this is as much change as we want to back-patch. Also clean up confusing code and incorrect documentation surrounding the fsync_writethrough option. Those changes shouldn't result in any actual behavioral change, but I chose to back-patch them anyway to keep the branches looking similar in this area. In 9.0 and HEAD, also do some copy-editing on the WAL Reliability documentation section. Back-patch to all supported branches, since any of them might get used on modern Linux versions.
2010-11-16 | The GiST scan algorithm uses LSNs to detect concurrent page splits, but | Heikki Linnakangas
temporary indexes are not WAL-logged. We used a constant LSN for temporary indexes, on the assumption that we don't need to worry about concurrent page splits in temporary indexes because they're only visible to the current session. But that assumption is wrong: it's possible to insert rows and split pages in the same session while a scan is in progress, for example by opening a cursor, fetching some rows, and INSERTing new rows before fetching some more. Fix by generating fake increasing LSNs, used in place of real LSNs in temporary GiST indexes.
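The fake-LSN idea can be illustrated with a small sketch. This is not the PostgreSQL implementation; `next_fake_lsn`, `Page`, and `split_happened_since` are hypothetical names, and the comparison is simplified to a single per-page LSN. It only shows why a constant LSN can never reveal a split, while an increasing counter can.

```python
import itertools

# Hypothetical per-session generator of fake LSNs (not PostgreSQL code).
_fake_lsn = itertools.count(1)


def next_fake_lsn():
    """Return a strictly increasing fake LSN for an index that is not WAL-logged."""
    return next(_fake_lsn)


class Page:
    """Simplified index page that carries only an LSN stamp."""

    def __init__(self):
        self.lsn = next_fake_lsn()

    def split(self):
        # A split stamps the page with a newer LSN, as a WAL-logged split would.
        self.lsn = next_fake_lsn()


def split_happened_since(page, lsn_seen_by_scan):
    """A scan compares the page's current LSN with the one it remembered earlier."""
    return page.lsn > lsn_seen_by_scan
```

With a constant LSN, `split_happened_since` would always return False, which is the failure mode the commit describes.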
2010-10-01 | Tag 8.2.18 (tag: REL8_2_18) | Marc G. Fournier
2010-09-22 | Some more gitignore cleanups: cover contrib and PL regression test outputs. | Tom Lane
Also do some further work in the back branches, where quite a bit wasn't covered by Magnus' original back-patch.
2010-09-22 | Convert cvsignore to gitignore, and add .gitignore for build targets. | Magnus Hagander
2010-09-02 | Fix up flushing of composite-type typcache entries to be driven directly by | Tom Lane
SI invalidation events, rather than indirectly through the relcache. In the previous coding, we had to flush a composite-type typcache entry whenever we discarded the corresponding relcache entry. This caused problems at least when testing with RELCACHE_FORCE_RELEASE, as shown in recent report from Jeff Davis, and might result in real-world problems given the kind of unexpected relcache flush that that test mechanism is intended to model. The new coding decouples relcache and typcache management, which is a good thing anyway from a structural perspective. The cost is that we have to search the typcache linearly to find entries that need to be flushed. There are a couple of ways we could avoid that, but at the moment it's not clear it's worth any extra trouble, because the typcache contains very few entries in typical operation. Back-patch to 8.2, the same as some other recent fixes in this general area. The patch could be carried back to 8.0 with some additional work, but given that it's only hypothetical whether we're fixing any problem observable in the field, it doesn't seem worth the work now.
2010-08-30 | Back-port into 8.2 an old fix to ensure that BYTE_ORDER gets set | Tom Lane
correctly on 64-bit Intel Solaris. Per my proposal yesterday, 8.2 is where we will start considering this platform supported. While this patch itself could easily go into older branches, there's not a huge amount of point unless we also make some significantly-more-invasive changes in the spinlock support.
2010-07-30 | Improved version of patch to protect pg_get_expr() against misuse: | Tom Lane
look through join alias Vars to avoid breaking join queries, and move the test to someplace where it will catch more possible ways of calling a function. We still ought to throw away the whole thing in favor of a data-type-based solution, but that's not feasible in the back branches. Completion of back-port of my patch of yesterday.
2010-07-28 | Fix potential failure when hashing the output of a subplan that produces | Tom Lane
a pass-by-reference datatype with a nontrivial projection step. We were using the same memory context for the projection operation as for the temporary context used by the hashtable routines in execGrouping.c. However, the hashtable routines feel free to reset their temp context at any time, which'd lead to destroying input data that was still needed. Report and diagnosis by Tao Ma. Back-patch to 8.1, where the problem was introduced by the changes that allowed us to work with "virtual" tuples instead of materializing intermediate tuple values everywhere. The earlier code looks quite similar, but it doesn't suffer the problem because the data gets copied into another context as a result of having to materialize ExecProject's output tuple.
2010-07-23 | Backpatch reservation of shared memory region during backend startup on | Magnus Hagander
Windows, so that memory allocated by starting third party DLLs doesn't end up conflicting. The same functionality has been in 8.3 and 8.4 for almost a year, and seems to have solved some of the more common shared memory errors on Windows.
2010-07-05 | The previous fix in CVS HEAD and 8.4 for handling the case where a cursor | Heikki Linnakangas
being used in a PL/pgSQL FOR loop is closed was inadequate, as Tom Lane pointed out. The bug affects FOR statement variants too, because you can close an implicitly created cursor too by guessing the "<unnamed portal X>" name created for it. To fix that, "pin" the portal to prevent it from being dropped while it's being used in a PL/pgSQL FOR loop. Backpatch all the way to 7.4 which is the oldest supported version.
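The pinning approach can be sketched as follows. This is a minimal illustration under assumed names (`Portal`, `PortalManager`, `pin`, `unpin`, `drop` are all hypothetical and do not mirror the PostgreSQL portal API): a pinned portal refuses to be dropped, even if the caller guesses its name.

```python
class Portal:
    """Hypothetical cursor handle with a pin flag."""

    def __init__(self, name):
        self.name = name
        self.pinned = False


class PortalManager:
    """Hypothetical registry of open portals, keyed by name."""

    def __init__(self):
        self.portals = {}

    def create(self, name):
        self.portals[name] = Portal(name)
        return self.portals[name]

    def pin(self, name):
        # Called when a FOR loop starts iterating over the portal.
        self.portals[name].pinned = True

    def unpin(self, name):
        # Called when the FOR loop finishes.
        self.portals[name].pinned = False

    def drop(self, name):
        portal = self.portals[name]
        if portal.pinned:
            # Refuse to close a portal that is still in use by a loop.
            raise RuntimeError(f"cannot drop pinned portal {name!r}")
        del self.portals[name]
```

Once the loop ends and the portal is unpinned, `drop` succeeds as usual.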
2010-05-14 | tag 8.2.17 (tag: REL8_2_17) | Marc G. Fournier
2010-04-01 | Don't pass an invalid file handle to dup2(). That causes a crash on | Heikki Linnakangas
Windows, thanks to a feature in CRT called Parameter Validation. Backpatch to 8.2, which is the oldest version supported on Windows. In 8.2 and 8.3 also backpatch the earlier change to use DEVNULL instead of NULL_DEV #define for a /dev/null-like device. NULL_DEV was hard-coded to "/dev/null" regardless of platform, which didn't work on Windows, while DEVNULL works on all platforms. Restarting syslogger didn't work on Windows on versions 8.3 and below because of that.
2010-03-25 | Prevent ALTER USER f RESET ALL from removing the settings that were put there | Alvaro Herrera
by a superuser -- "ALTER USER f RESET setting" already disallows removing such a setting. Apply the same treatment to ALTER DATABASE d RESET ALL when run by a database owner that's not superuser.
2010-03-12 | tag 8.2.16 (tag: REL8_2_16) | Marc G. Fournier
2009-12-10 | tag 8.2.15 (tag: REL8_2_15) | Marc G. Fournier
2009-12-09 | Prevent indirect security attacks via changing session-local state within | Tom Lane
an allegedly immutable index function. It was previously recognized that we had to prevent such a function from executing SET/RESET ROLE/SESSION AUTHORIZATION, or it could trivially obtain the privileges of the session user. However, since there is in general no privilege checking for changes of session-local state, it is also possible for such a function to change settings in a way that might subvert later operations in the same session. Examples include changing search_path to cause an unexpected function to be called, or replacing an existing prepared statement with another one that will execute a function of the attacker's choosing. The present patch secures VACUUM, ANALYZE, and CREATE INDEX/REINDEX against these threats, which are the same places previously deemed to need protection against the SET ROLE issue. GUC changes are still allowed, since there are many useful cases for that, but we prevent security problems by forcing a rollback of any GUC change after completing the operation. Other cases are handled by throwing an error if any change is attempted; these include temp table creation, closing a cursor, and creating or deleting a prepared statement. (In 7.4, the infrastructure to roll back GUC changes doesn't exist, so we settle for rejecting changes of "search_path" in these contexts.) Original report and patch by Gurjeet Singh, additional analysis by Tom Lane. Security: CVE-2009-4136
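The "allow the change, then force a rollback" treatment of GUC settings can be sketched with a context manager. This is only an illustration of the pattern; `restore_settings` and the dictionary of settings are assumptions made for the example, not PostgreSQL's GUC machinery.

```python
from contextlib import contextmanager


@contextmanager
def restore_settings(settings):
    """Let the wrapped operation change settings, then roll every change back."""
    saved = dict(settings)  # snapshot taken before the operation runs
    try:
        yield settings
    finally:
        # Forced rollback: runs whether the operation succeeded or raised.
        settings.clear()
        settings.update(saved)
```

A function running inside the block may still set `search_path` for its own use, but the change cannot leak into later operations in the same session.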
2009-12-03 | Fix bug in temporary file management with subtransactions. A cursor opened | Heikki Linnakangas
in a subtransaction stays open even if the subtransaction is aborted, so any temporary files related to it must stay alive as well. With the patch, we use ResourceOwners to track open temporary files and don't automatically close them at subtransaction end (though in the normal case temporary files are registered with the subtransaction resource owner and will therefore be closed). At end of top transaction, we still check that there's no temporary files marked as close-at-end-of-transaction open, but that's now just a debugging cross-check as the resource owner cleanup should've closed them already.
2009-11-23 | Fix an old bug in multixact and two-phase commit. Prepared transactions can | Heikki Linnakangas
be part of multixacts, so allocate a slot for each prepared transaction in the "oldest member" array in multixact.c. On PREPARE TRANSACTION, transfer the oldest member value from the current backend's slot to the prepared xact slot. Also save and recover the value from the 2pc state file. The symptom of the bug was that after a transaction was prepared, a shared lock still held by the prepared transaction was sometimes ignored by other transactions. Fix back to 8.1, where both 2PC and multixact were introduced.
2009-11-20 | Revert backpatch of inheritable-ACE patch for Win32, since it broke | Magnus Hagander
compatibility with pre-Windows 2000 versions.
2009-11-15 | Backpatch the inheritable-ACE patch for Win32 to 8.2 as well, except | Magnus Hagander
for the pg_regress part which did not support admin execution in 8.2.
2009-11-10 | Fix longstanding problems in VACUUM caused by untimely interruptions | Alvaro Herrera
In VACUUM FULL, an interrupt after the initial transaction has been recorded as committed can cause postmaster to restart with the following error message:

    PANIC: cannot abort transaction NNNN, it was already committed

This problem has been reported many times.

In lazy VACUUM, an interrupt after the table has been truncated by lazy_truncate_heap causes other backends' relcache to still point to the removed pages; this can cause future INSERT and UPDATE queries to error out with the following error message:

    could not read block XX of relation 1663/NNN/MMMM: read only 0 of 8192 bytes

The window to this race condition is extremely narrow, but it has been seen in the wild involving a cancelled autovacuum process.

The solution for both problems is to inhibit interrupts in both operations until after the respective transactions have been committed. It's not a complete solution, because the transaction could theoretically be aborted by some other error, but at least fixes the most common causes of both problems.
2009-09-04 | Tag 8.2.14 (tag: REL8_2_14) | Marc G. Fournier
2009-09-03 | Disallow RESET ROLE and RESET SESSION AUTHORIZATION inside security-definer | Tom Lane
functions. This extends the previous patch that forbade SETting these variables inside security-definer functions. RESET is equally a security hole, since it would allow regaining privileges of the caller; furthermore it can trigger Assert failures and perhaps other internal errors, since the code is not expecting these variables to change in such contexts. The previous patch did not cover this case because assign hooks don't really have enough information, so move the responsibility for preventing this into guc.c. Problem discovered by Heikki Linnakangas. Security: no CVE assigned yet, extends CVE-2007-6600
2009-07-20 | Install src/include/utils/fmgroids.h on VPATH builds too. | Alvaro Herrera
The original coding was not dealing specially with this file being a symlink, with the end result that it was not installed in VPATH builds. Oddly enough, the clean target does know about it ...
2009-06-05 | GIN's ItemPointerIsMin, ItemPointerIsMax, and ItemPointerIsLossyPage macros | Tom Lane
should use GinItemPointerGetBlockNumber/GinItemPointerGetOffsetNumber, not ItemPointerGetBlockNumber/ItemPointerGetOffsetNumber, because the latter will Assert() on ip_posid == 0, ie a "Min" pointer. (Thus, ItemPointerIsMin has never worked at all, but it seems unused at present.) I'm not certain that the case can occur in normal functioning, but it's blowing up on me while investigating Tatsuo-san's data corruption problem. In any case it seems like a problem waiting to bite someone. Back-patch just in case this really is a problem for somebody in the field.
2009-03-30 | Fix an oversight in the support for storing/retrieving "minimal tuples" in | Tom Lane
TupleTableSlots. We have functions for retrieving a minimal tuple from a slot after storing a regular tuple in it, or vice versa; but these were implemented by converting the internal storage from one format to the other. The problem with that is it invalidates any pass-by-reference Datums that were already fetched from the slot, since they'll be pointing into the just-freed version of the tuple. The known problem cases involve fetching both a whole-row variable and a pass-by-reference value from a slot that is fed from a tuplestore or tuplesort object. The added regression tests illustrate some simple cases, but there may be other failure scenarios traceable to the same bug. Note that the added tests probably only fail on unpatched code if it's built with --enable-cassert; otherwise the bug leads to fetching from freed memory, which will not have been overwritten without additional conditions. Fix by allowing a slot to contain both formats simultaneously; which turns out not to complicate the logic much at all, if anything it seems less contorted than before. Back-patch to 8.2, where minimal tuples were introduced.
2009-03-24 | Install a search tree depth limit in GIN bulk-insert operations, to prevent | Tom Lane
them from degrading badly when the input is sorted or nearly so. In this scenario the tree is unbalanced to the point of becoming a mere linked list, so insertions become O(N^2). The easiest and most safely back-patchable solution is to stop growing the tree sooner, ie limit the growth of N. We might later consider a rebalancing tree algorithm, but it's not clear that the benefit would be worth the cost and complexity. Per report from Sergey Burladyan and an earlier complaint from Heikki. Back-patch to 8.2; older versions didn't have GIN indexes.
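The degradation and the effect of a depth limit can be reproduced with a toy unbalanced binary search tree. This is a sketch under assumptions, not the GIN code: `Node`, `insert`, `bulk_insert`, and the `max_depth` parameter are hypothetical, and "flushing" is modelled simply by discarding the tree where the real code would write the accumulated entries out.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into an unbalanced BST; return (root, depth of the new node)."""
    if root is None:
        return Node(key), 1
    node, depth = root, 1
    while True:
        depth += 1
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root, depth
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root, depth
            node = node.right


def bulk_insert(keys, max_depth=None):
    """Accumulate keys, restarting the tree once an insert reaches max_depth.

    Returns (number of flushes, total node visits across all insertions).
    """
    root, flushes, steps = None, 0, 0
    for key in keys:
        root, depth = insert(root, key)
        steps += depth
        if max_depth is not None and depth >= max_depth:
            root = None  # stand-in for writing the accumulated entries out
            flushes += 1
    return flushes, steps
```

For 1000 sorted keys, the unlimited tree degenerates into a linked list and costs 500500 node visits, whereas a depth limit of 10 keeps the total at 5500 at the price of 100 flushes.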
2009-03-13 | tag 8.2.13 (tag: REL8_2_13) | Marc G. Fournier
2009-03-02 | When we are in error recursion trouble, arrange to suppress translation and | Tom Lane
encoding conversion of any elog/ereport message being sent to the frontend. This generalizes a patch that I put in last October, which suppressed translation of only specific messages known to be associated with recursive can't-translate-the-message behavior. As shown in bug #4680, we need a more general answer in order to have some hope of coping with broken encoding conversion setups. This approach seems a good deal less klugy anyway. Patch in all supported branches.
2009-02-24 | Repair a longstanding bug in CLUSTER and the rewriting variants of ALTER | Tom Lane
TABLE: if the command is executed by someone other than the table owner (eg, a superuser) and the table has a toast table, the toast table's pg_type row ends up with the wrong typowner, ie, the command issuer not the table owner. This is quite harmless for most purposes, since no interesting permissions checks consult the pg_type row. However, it could lead to unexpected failures if one later tries to drop the role that issued the command (in 8.1 or 8.2), or strange warnings from pg_dump afterwards (in 8.3 and up, which will allow the DROP ROLE because we don't create a "redundant" owner dependency for table rowtypes). Problem identified by Cott Lang. Back-patch to 8.1. The problem is actually far older --- the CLUSTER variant can be demonstrated in 7.0 --- but it's mostly cosmetic before 8.1 because we didn't track ownership dependencies before 8.1. Also, fixing it before 8.1 would require changing the call signature of heap_create_with_catalog(), which seems to carry a nontrivial risk of breaking add-on modules.
2009-01-30 | tag 8.2.12 (tag: REL8_2_12) | Marc G. Fournier
2009-01-29 | Replace argument-checking Asserts with regular test-and-elog checks in all | Tom Lane
encoding conversion functions. These are not can't-happen cases because it's possible to create a conversion with the wrong conversion function for the specified encoding pair. That would lead to an Assert crash in an Assert-enabled build, or incorrect conversion otherwise, neither of which is desirable. This would be a DOS issue if production databases were customarily built with asserts enabled, but fortunately that's not so. Per an observation by Heikki. Back-patch to all supported branches.
2009-01-07 | Insert conditional SPI_push/SPI_pop calls into InputFunctionCall, | Tom Lane
OutputFunctionCall, and friends. This allows SPI-using functions to invoke datatype I/O without concern for the possibility that a SPI-using function will be called (which could be either the I/O function itself, or a function used in a domain check constraint). It's a tad ugly, but not nearly as ugly as what'd be needed to make this work via retail insertion of push/pop operations in all the PLs. This reverts my patch of 2007-01-30 that inserted some retail SPI_push/pop calls into plpgsql; that approach only fixed plpgsql, and not any other PLs. But the other PLs have the issue too, as illustrated by a recent gripe from Christian Schröder. Back-patch to 8.2, which is as far back as this solution will work. It's also as far back as we need to worry about the domain-constraint case, since earlier versions did not attempt to check domain constraints within datatype input. I'm not aware of any old I/O functions that use SPI themselves, so this should be sufficient for a back-patch.
2008-12-13 | Fix failure to ensure that a snapshot is available to datatype input functions | Tom Lane
when they are invoked by the parser. We had been setting up a snapshot at plan time but really it needs to be done earlier, before parse analysis. Per report from Dmitry Koterov. Also fix two related problems discovered while poking at this one: exec_bind_message called datatype input functions without establishing a snapshot, and SET CONSTRAINTS IMMEDIATE could call trigger functions without establishing a snapshot. Backpatch to 8.2. The underlying problem goes much further back, but it is masked in 8.1 and before because we didn't attempt to invoke domain check constraints within datatype input. It would only be exposed if a C-language datatype input function used the snapshot; which evidently none do, or we'd have heard complaints sooner. Since this code has changed a lot over time, a back-patch is hardly risk-free, and so I'm disinclined to patch further than absolutely necessary.
2008-12-01 | Fix an oversight in the code that makes transitive-equality deductions from | Tom Lane
outer join clauses. Given, say, ... from a left join b on a.a1 = b.b1 where a.a1 = 42; we'll deduce a clause b.b1 = 42 and then mark the original join clause redundant (we can't remove it completely for reasons I don't feel like squeezing into this log entry). However the original implementation of that wasn't bulletproof, because clause_selectivity() wouldn't honor this_selec if given nonzero varRelid --- which in practice meant that it worked as desired *except* when considering index scan quals. Which resulted in bogus underestimation of the size of the indexscan result for an inner indexscan in an outer join, and consequently a possibly bad choice of indexscan vs. bitmap scan. Fix by introducing an explicit test into clause_selectivity(). Also, to make sure we don't trigger that test in corner cases, change the convention to be that this_selec > 1, not this_selec = 1, means it's been marked redundant. Per trouble report from Scara Maccai. Back-patch to 8.2, where the problem was introduced.
2008-12-01 | Ensure that the contents of a holdable cursor don't depend on out-of-line | Tom Lane
toasted values, since those could get dropped once the cursor's transaction is over. Per bug #4553 from Andrew Gierth. Back-patch as far as 8.1. The bug actually exists back to 7.4 when holdable cursors were introduced, but this patch won't work before 8.1 without significant adjustments. Given the lack of field complaints, it doesn't seem worth the work (and risk of introducing new bugs) to try to make a patch for the older branches.
2008-11-11 | Get rid of adjust_appendrel_attr_needed(), which has been broken ever since | Tom Lane
we extended the appendrel mechanism to support UNION ALL optimization. The reason nobody noticed was that we are not actually using attr_needed data for appendrel children; hence it seems more reasonable to rip it out than fix it. Back-patch to 8.2 because an Assert failure is possible in corner cases. Per examination of an example from Jim Nasby. In HEAD, also get rid of AppendRelInfo.col_mappings, which is quite inadequate to represent UNION ALL situations; depend entirely on translated_vars instead.
2008-10-31 | tag 8.2.11 (tag: REL8_2_11) | Marc G. Fournier
2008-10-27 | Install a more robust solution for the problem of infinite error-processing | Tom Lane
recursion when we are unable to convert a localized error message to the client's encoding. We've been over this ground before, but as reported by Ibrar Ahmed, it still didn't work in the case of conversion failures for the conversion-failure message itself :-(. Fix by installing a "circuit breaker" that disables attempts to localize this message once we get into recursion trouble. Patch all supported branches, because it is in fact broken in all of them; though I had to add some missing translations to the older branches in order to expose the failure in the particular test case I was using.
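The "circuit breaker" can be sketched as a recursion-depth guard around message translation. Everything here is hypothetical (`ErrorReporter`, `MAX_DEPTH`, and the injected `translate` callable are assumptions for illustration); the sketch only shows how giving up on localization past a threshold ends the recursion.

```python
class ErrorReporter:
    """Hypothetical error reporter with a recursion circuit breaker."""

    MAX_DEPTH = 2  # assumed threshold, not taken from PostgreSQL

    def __init__(self, translate):
        self.translate = translate  # localization/conversion step; may raise
        self.depth = 0

    def report(self, message):
        self.depth += 1
        try:
            if self.depth > self.MAX_DEPTH:
                # In recursion trouble: emit the message untranslated.
                return message
            try:
                return self.translate(message)
            except Exception:
                # Reporting the conversion failure re-enters report().
                return self.report("conversion failed")
        finally:
            self.depth -= 1
```

If `translate` fails for every message, including the failure message itself, the third nested call skips translation and returns the plain text instead of recursing forever.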
2008-10-22 | Fix GiST's killing tuple: GISTScanOpaque->curpos wasn't | Teodor Sigaev
correctly set. As a result, killtuple() marked the wrong tuple on the page as dead. The bug was introduced by me while fixing possible duplicates during GiST index scans.
2008-09-19 | tag for 8.2.10 (tag: REL8_2_10) | Marc G. Fournier
2008-09-16 | Widen the nLocks counts in local lock tables from int to int64. This | Tom Lane
forestalls potential overflow when the same table (or other object, but usually tables) is accessed by very many successive queries within a single transaction. Per report from Michael Milligan. Back-patch to 8.0, which is as far back as the patch conveniently applies. There have been no reports of overflow in pre-8.3 releases, but clearly the risk existed all along. (Michael's report suggests that 8.3 may consume lock counts faster than prior releases, but with no test case to look at it's hard to be sure about that. Widening the counts seems a good future-proofing measure in any event.)
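The overflow risk can be demonstrated by emulating fixed-width counters with `ctypes`. The helper `bump` is a hypothetical name; it models two's-complement truncation, which is what typically happens on real hardware even though signed overflow is formally undefined in C.

```python
import ctypes


def bump(counter_type, start, times=1):
    """Increment a fixed-width signed counter, truncating to its width each time."""
    value = counter_type(start)
    for _ in range(times):
        # ctypes integer types truncate silently instead of raising on overflow.
        value = counter_type(value.value + 1)
    return value.value
```

Starting from 2**31 - 1, one more increment turns a 32-bit count negative, while the same increment on a 64-bit count simply yields 2**31.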
2008-08-23 | Fix possible duplicate tuples during GiST scan. Now a page is processed | Teodor Sigaev
at once and ItemPointers are collected in memory. Remove killing of tuples by killtuple() if the tuple was moved to another page, since it could produce unacceptable overhead. Backpatch up to 8.1 because the bug was introduced by GiST's concurrency support.
2008-08-14 | Fix pull_up_simple_union_all to copy all rtable entries from child subquery to | Heikki Linnakangas
parent, not only those with RangeTblRefs. We need them in ExecCheckRTPerms. Report by Brendan O'Shea. Back-patch to 8.2, where pull_up_simple_union_all was introduced.
2008-06-08 | Stamp 8.2.9 (except for configure.in/configure) | Tom Lane
2008-06-05 | Stamp 8.2.8 (except for configure.in/configure) | Tom Lane
2008-05-27 | Back-patch the 8.3 fix that prohibits TRUNCATE, CLUSTER, and REINDEX when the | Tom Lane
current transaction has any open references to the target relation or index (implying it has an active query using the relation). Also back-patch the 8.2 fix that prohibits TRUNCATE and CLUSTER when there are pending AFTER-trigger events. Per suggestion from Heikki.
2008-04-22 | Fix bug of using too many LWLocks, reported by Craig Ringer | Teodor Sigaev
<craig@postnewspapers.com.au>. It was my mistake: I missed the limit on the number of held locks. GIN no longer uses continuous locks, but still holds buffers pinned to prevent interference with vacuum's deletion algorithm.