summaryrefslogtreecommitdiff
path: root/src/backend/access/transam/multixact.c
AgeCommit message (Collapse)Author
2009-12-19Allow read only connections during recovery, known as Hot Standby.Simon Riggs
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record. New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far. This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required. Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit. Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
2009-11-23Fix an old bug in multixact and two-phase commit. Prepared transactions canHeikki Linnakangas
be part of multixacts, so allocate a slot for each prepared transaction in the "oldest member" array in multixact.c. On PREPARE TRANSACTION, transfer the oldest member value from the current backends slot to the prepared xact slot. Also save and recover the value from the 2pc state file. The symptom of the bug was that after a transaction prepared, a shared lock still held by the prepared transaction was sometimes ignored by other transactions. Fix back to 8.1, where both 2PC and multixact were introduced.
2009-06-26Cleanup and code review for the patch that made bgwriter active duringTom Lane
archive recovery. Invent a separate state variable and inquiry function for XLogInsertAllowed() to clarify some tests and make the management of writing the end-of-recovery checkpoint less klugy. Fix several places that were incorrectly testing InRecovery when they should be looking at RecoveryInProgress or XLogInsertAllowed (because they will now be executed in the bgwriter not startup process). Clarify handling of bad LSNs passed to XLogFlush during recovery. Use a spinlock for setting/testing SharedRecoveryInProgress. Improve quite a lot of comments. Heikki and Tom
2009-01-20Add a new option to RestoreBkpBlocks() to indicate if a cleanup lock shouldHeikki Linnakangas
be used instead of the normal exclusive lock, and make WAL redo functions responsible for calling RestoreBkpBlocks(). They know better what kind of a lock they need. At the moment, this just moves things around with no functional change, but makes the hot standby patch that's under review cleaner.
2009-01-01Update copyright for 2009.Bruce Momjian
2008-08-01Add a few more DTrace probes to the backend.Alvaro Herrera
Robert Lor
2008-01-01Update copyrights in source tree to 2008.Bruce Momjian
2007-11-15pgindent run for 8.3.Bruce Momjian
2007-09-05Implement lazy XID allocation: transactions that do not modify any databaseTom Lane
rows will normally never obtain an XID at all. We already did things this way for subtransactions, but this patch extends the concept to top-level transactions. In applications where there are lots of short read-only transactions, this should improve performance noticeably; not so much from removal of the actual XID-assignments, as from reduction of overhead that's driven by the rate of XID consumption. We add a concept of a "virtual transaction ID" so that active transactions can be uniquely identified even if they don't have a regular XID. This is a much lighter-weight concept: uniqueness of VXIDs is only guaranteed over the short term, and no on-disk record is made about them. Florian Pflug, with some editorialization by Tom.
2007-08-01Support an optional asynchronous commit mode, in which we don't flush WALTom Lane
before reporting a transaction committed. Data consistency is still guaranteed (unlike setting fsync = off), but a crash may lose the effects of the last few transactions. Patch by Simon, some editorialization by Tom.
2007-01-05Update CVS HEAD for 2007 copyright. Back branches are typically notBruce Momjian
back-stamped for this.
2006-11-17Repair two related errors in heap_lock_tuple: it was failing to recognizeTom Lane
cases where we already hold the desired lock "indirectly", either via membership in a MultiXact or because the lock was originally taken by a different subtransaction of the current transaction. These cases must be accounted for to avoid needless deadlocks and/or inappropriate replacement of an exclusive lock with a shared lock. Per report from Clarence Gardner and subsequent investigation.
2006-10-04pgindent run for 8.2.Bruce Momjian
2006-07-20Don't try to truncate multixact SLRU files in checkpoints done during xlogTom Lane
recovery. In the first place, it doesn't work because slru's latest_page_number isn't set up yet (this is why we've been hearing reports of strange "apparent wraparound" log messages during crash recovery, but only from people who'd managed to advance their next-mxact counters some considerable distance from 0). In the second place, it seems a bit unwise to be throwing away data during crash recovery anwyway. This latter consideration convinces me to just disable truncation during recovery, rather than computing latest_page_number and pushing ahead.
2006-07-13Allow include files to compile own their own.Bruce Momjian
Strip unused include files out unused include files, and add needed includes to C files. The next step is to remove unused include files in C files.
2006-07-11Alphabetically order reference to include files, "G" - "M".Bruce Momjian
2006-03-24Arrange to emit a description of the current XLOG record as error contextTom Lane
when an error occurs during xlog replay. Also, replace the former risky 'write into a fixed-size buffer with no overflow detection' API for XLOG record description routines; use an expansible StringInfo instead. (The latter accounts for most of the patch bulk.) Qingqing Zhou
2006-03-05Update copyright for 2006. Update scripts.Bruce Momjian
2005-12-06Get rid of slru.c's hardwired insistence on a fixed number of slots perTom Lane
SLRU area. The number of slots is still a compile-time constant (someday we might want to change that), but at least it's a different constant for each SLRU area. Increase number of subtrans buffers to 32 based on experimentation with a heavily subtrans-bashing test case, and increase number of multixact member buffers to 16, since it's obviously silly for it not to be at least twice the number of multixact offset buffers.
2005-12-06Arrange for read-only accesses to SLRU page buffers to take only a sharedTom Lane
lock, not exclusive, if the desired page is already in memory. This can be demonstrated to be a significant win on the pg_subtrans cache when there is a large window of open transactions. It should be useful for pg_clog as well. I didn't try to make GetMultiXactIdMembers() use the code, as that would have taken some restructuring, and what with the local cache for multixact contents it probably wouldn't really make a difference. Per my recent proposal.
2005-11-22Re-run pgindent, fixing a problem where comment lines after a blankBruce Momjian
comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.
2005-11-05Clean up representation of SLRU page state. This is the cleaner fixTom Lane
for the SLRU race condition that I posted a few days ago, but we decided not to use in 8.1 and older branches.
2005-10-28Reorder code so that we don't have to hold a critical section whileTom Lane
reserving SLRU space for a new MultiXact. The original coding would have treated out-of-disk-space as a PANIC condition, which is unnecessary.
2005-10-28Fix race condition in multixact code: it's possible to try to read aTom Lane
multixact's starting offset before the offset has been stored into the SLRU file. A simple fix would be to hold the MultiXactGenLock until the offset has been stored, but that looks like a big concurrency hit. Instead rely on knowledge that unset offsets will be zero, and loop when we see a zero. This requires a little extra hacking to ensure that zero is never a valid value for the offset. Problem reported by Matteo Beccati, fix ideas from Martijn van Oosterhout, Alvaro Herrera, and Tom Lane.
2005-10-15Standard pgindent run for 8.1.Bruce Momjian
2005-08-20Convert the arithmetic for shared memory size calculation from 'int'Tom Lane
to 'Size' (that is, size_t), and install overflow detection checks in it. This allows us to remove the former arbitrary restrictions on NBuffers etc. It won't make any difference in a 32-bit machine, but in a 64-bit machine you could theoretically have terabytes of shared buffers. (How efficiently we could manage 'em remains to be seen.) Similarly, num_temp_buffers, work_mem, and maintenance_work_mem can be set above 2Gb on a 64-bit machine. Original patch from Koichi Suzuki, additional work by moi.
2005-08-20Make GetMultiXactIdMembers() a public function.Tatsuo Ishii
2005-08-01Add NOWAIT option to SELECT FOR UPDATE/SHARE.Tom Lane
Original patch by Hans-Juergen Schoenig, revisions by Karel Zak and Tom Lane.
2005-06-08Change WAL-logging scheme for multixacts to be more like regularTom Lane
transaction IDs, rather than like subtrans; in particular, the information now survives a database restart. Per previous discussion, this is essential for PITR log shipping and for 2PC.
2005-05-19Split the shared-memory array of PGPROC pointers out of the sinvalTom Lane
communication structure, and make it its own module with its own lock. This should reduce contention at least a little, and it definitely makes the code seem cleaner. Per my recent proposal.
2005-05-07Fix case in which a debug printout would print already-pfreed data.Tom Lane
2005-05-03Clean up MultiXactIdExpand's API by separating out the case where weTom Lane
are creating a new MultiXactId from two regular XIDs. The original coding was unnecessarily complicated and didn't save any code anyway.
2005-04-28Implement sharable row-level locks, and use them for foreign key referencesTom Lane
to eliminate unnecessary deadlocks. This commit adds SELECT ... FOR SHARE paralleling SELECT ... FOR UPDATE. The implementation uses a new SLRU data structure (managed much like pg_subtrans) to represent multiple- transaction-ID sets. When more than one transaction is holding a shared lock on a particular row, we create a MultiXactId representing that set of transactions and store its ID in the row's XMAX. This scheme allows an effectively unlimited number of row locks, just as we did before, while not costing any extra overhead except when a shared lock actually has to be shared. Still TODO: use the regular lock manager to control the grant order when multiple backends are waiting for a row lock. Alvaro Herrera and Tom Lane.