summaryrefslogtreecommitdiff
path: root/src/backend/access/transam/xlog.c
AgeCommit message (Collapse)Author
2008-05-13Don't try to close negative file descriptors, since this can causeMagnus Hagander
crashes on certain platforms. In particular, the MSVC runtime is known to do this. Fixes bug #4162, reported and diagnosed by Javier Pimas
2008-04-17Repair two places where SIGTERM exit could leave shared memory stateTom Lane
corrupted. (Neither is very important if SIGTERM is used to shut down the whole database cluster together, but there's a problem if someone tries to SIGTERM individual backends.) To do this, introduce new infrastructure macros PG_ENSURE_ERROR_CLEANUP/PG_END_ENSURE_ERROR_CLEANUP that take care of transiently pushing an on_shmem_exit cleanup hook. Also use this method for createdb cleanup --- that wasn't a shared-memory-corruption problem, but SIGTERM abort of createdb could leave orphaned files lying around. Backpatch as far as 8.2. The shmem corruption cases don't exist in 8.1, and the createdb usage doesn't seem important enough to risk backpatching further.
2007-09-29Make archive recovery always start a new timeline, rather than only when aTom Lane
recovery stop time was used. This avoids a corner-case risk of trying to overwrite an existing archived copy of the last WAL segment, and seems simpler and cleaner all around than the original definition. Per example from Jon Colverson and subsequent analysis by Simon.
2007-08-04Suppress time zone name (%Z) when logging timestamps in xlog.c startupTom Lane
on Windows. This is yet another manifestation of the problem that Windows returns time zone names that may be in a different encoding than we are using. I've put a better solution in HEAD, but the back branches need a simple patch. Per report from Hiroshi Saito.
2006-11-30Minor adjustments to make failures in startup/shutdown behave more cleanly.Tom Lane
StartupXLOG and ShutdownXLOG no longer need to be critical sections, because in all contexts where they are invoked, elog(ERROR) would be translated to elog(FATAL) anyway. (One change in bgwriter.c is needed to make this true: set ExitOnAnyError before trying to exit. This is a good fix anyway since the existing code would have gone into an infinite loop on elog(ERROR) during shutdown.) That avoids a misleading report of PANIC during semi-orderly failures. Modify the postmaster to include the startup process in the set of processes that get SIGTERM when a fast shutdown is requested, and also fix it to not try to restart the bgwriter if the bgwriter fails while trying to write the shutdown checkpoint. Net result is that "pg_ctl stop -m fast" does something reasonable for a system in warm standby mode, and so should Unix system shutdown (ie, universal SIGTERM). Per gripe from Stephen Harris and some corner-case testing of my own.
2006-11-21On systems that have setsid(2) (which should be just about everything exceptTom Lane
Windows), arrange for each postmaster child process to be its own process group leader, and deliver signals SIGINT, SIGTERM, SIGQUIT to the whole process group not only the direct child process. This provides saner behavior for archive and recovery scripts; in particular, it's possible to shut down a warm-standby recovery server using "pg_ctl stop -m immediate", since delivery of SIGQUIT to the startup subprocess will result in killing the waiting recovery_command. Also, this makes Query Cancel and statement_timeout apply to scripts being run from backends via system(). (There is no support in the core backend for that, but it's widely done using untrusted PLs.) Per gripe from Stephen Harris and subsequent discussion.
2006-11-16String fixPeter Eisentraut
2006-11-10Clean up some misleading references to %p being a full path, per Simon.Tom Lane
2006-11-08Change Windows rename and unlink substitutes so that they time out afterTom Lane
30 seconds instead of retrying forever. Also modify xlog.c so that if it fails to rename an old xlog segment up to a future slot, it will unlink the segment instead. Per discussion of bug #2712, in which it became apparent that Windows can handle unlinking a file that's being held open, but not renaming it.
2006-11-05Fix recently-understood problems with handling of XID freezing, particularlyTom Lane
in PITR scenarios. We now WAL-log the replacement of old XIDs with FrozenTransactionId, so that such replacement is guaranteed to propagate to PITR slave databases. Also, rather than relying on hint-bit updates to be preserved, pg_clog is not truncated until all instances of an XID are known to have been replaced by FrozenTransactionId. Add new GUC variables and pg_autovacuum columns to allow management of the freezing policy, so that users can trade off the size of pg_clog against the amount of freezing work done. Revise the already-existing code that forces autovacuum of tables approaching the wraparound point to make it more bulletproof; also, revise the autovacuum logic so that anti-wraparound vacuuming is done per-table rather than per-database. initdb forced because of changes in pg_class, pg_database, and pg_autovacuum catalogs. Heikki Linnakangas, Simon Riggs, and Tom Lane.
2006-10-18Add some code to CREATE DATABASE to check for pre-existing subdirectoriesTom Lane
that conflict with the OID that we want to use for the new database. This avoids the risk of trying to remove files that maybe we shouldn't remove. Per gripe from Jon Lapham and subsequent discussion of 27-Sep.
2006-10-06Message style improvementsPeter Eisentraut
2006-10-04pgindent run for 8.2.Bruce Momjian
2006-08-21Make the server track an 'XID epoch', that is, maintain higher-order bitsTom Lane
of the transaction ID counter. Nothing is done with the epoch except to store it in checkpoint records, but this provides a foundation with which add-on code can pretend that XIDs never wrap around. This is a severely trimmed and rewritten version of the xxid patch submitted by Marko Kreen. Per discussion, the epoch counter seems the only part of xxid that really needs to be in the core server.
2006-08-17Implement archive_timeout feature to force xlog file switches to occur no moreTom Lane
than N seconds apart. This allows a simple, if not very high performance, means of guaranteeing that a PITR archive is no more than N seconds behind real time. Also make pg_current_xlog_location return the WAL Write pointer, add pg_current_xlog_insert_location to return the Insert pointer, and fix pg_xlogfile_name_offset to return its results as a two-element record instead of a smashed-together string, as per recent discussion. Simon Riggs
2006-08-07Make recovery from WAL be restartable, by executing a checkpoint-likeTom Lane
operation every so often. This improves the usefulness of PITR log shipping for hot standby: formerly, if the standby server crashed, it was necessary to restart it from the last base backup and replay all the WAL since then. Now it will only need to reread about the same amount of WAL as the master server would. The behavior might also come in handy during a long PITR replay sequence. Simon Riggs, with some editorialization by Tom Lane.
2006-08-06Add support for forcing a switch to a new xlog file; cause such a switchTom Lane
to happen automatically during pg_stop_backup(). Add some functions for interrogating the current xlog insertion point and for easily extracting WAL filenames from the hex WAL locations displayed by pg_stop_backup and friends. Simon Riggs with some editorialization by Tom Lane.
2006-07-30Modify snapshot definition so that lazy vacuums are ignored by otherAlvaro Herrera
vacuums. This allows a OLTP-like system with big tables to continue regular vacuuming on small-but-frequently-updated tables while the big tables are being vacuumed. Original patch from Hannu Krossing, rewritten by Tom Lane and updated by me.
2006-07-14Remove 576 references of include files that were not needed.Bruce Momjian
2006-07-13Allow include files to compile own their own.Bruce Momjian
Strip unused include files out unused include files, and add needed includes to C files. The next step is to remove unused include files in C files.
2006-06-27Put #ifdef NOT_USED around posix_fadvise call. We may want to resurrectTom Lane
this someday, but right now it seems that posix_fadvise is immature to the point of being broken on many platforms ... and we don't have any benchmark evidence proving it's worth spending time on.
2006-06-22pg_stop_backup was calling XLogArchiveNotify() twice for the newly createdTom Lane
backup history file. Bug introduced by the 8.1 change to make pg_stop_backup delete older history files. Per report from Masao Fujii.
2006-06-18Don't try to call posix_fadvise() unless <fcntl.h> supplies a declarationTom Lane
for it. Hopefully will fix core dump evidenced by some buildfarm members since fadvise patch went in. The actual definition of the function is not ABI-compatible with compiler's default assumption in the absence of any declaration, so it's clearly unsafe to try to call it without seeing a declaration.
2006-06-16Test for POSIX_FADV_DONTNEED to use posix_fadvise().Bruce Momjian
2006-06-15Use posix_fadvise() to avoid kernel caching of WAL contents on WAL fileBruce Momjian
close. ITAGAKI Takahiro
2006-04-20Ensure that we validate the page header of the first page of a WAL fileTom Lane
whenever we start to read within that file. The first page carries extra identification information that really ought to be checked, but as the code stood, this was only checked when we switched sequentially into a new WAL file, or if by chance the starting checkpoint record was within the first page. This patch ensures that we will detect bogus 'long header' information before we start replaying the WAL sequence.
2006-04-17Fix the torn-page hazard for PITR base backups by forcing full page writesTom Lane
to occur between pg_start_backup() and pg_stop_backup(), even if the GUC setting full_page_writes is OFF. Per discussion, doing this in combination with the already-existing checkpoint during pg_start_backup() should ensure safety against partial page updates being included in the backup. We do not have to force full page writes to occur during normal PITR operation, as I had first feared.
2006-04-14Make the world safe for full_page_writes. Allow XLOG records that try toTom Lane
update no-longer-existing pages to fall through as no-ops, but make a note of each page number referenced by such records. If we don't see a later XLOG entry dropping the table or truncating away the page, complain at the end of XLOG replay. Since this fixes the known failure mode for full_page_writes = off, revert my previous band-aid patch that disabled that GUC variable.
2006-04-05Add a field to the first page of each WAL file to indicate theTom Lane
XLOG_BLCKSZ. This ought to help in preventing configuration mismatch problems if anyone tries to ship PITR files between servers compiled with different XLOG_BLCKSZ settings. Simon Riggs
2006-04-04Don't use BLCKSZ for the physical length of the pg_control file, butTom Lane
instead a dedicated symbol. This probably makes no functional difference for likely values of BLCKSZ, but it makes the intent clearer. Simon Riggs, minor editorialization by Tom Lane.
2006-04-03Define a separately configurable XLOG_BLCKSZ symbol for the page sizeTom Lane
used within WAL files. Historically this was the same as the data file BLCKSZ, but there's no necessary connection, and it's possible that performance gains might ensue from reducing XLOG_BLCKSZ. In any case distinguishing two symbols should improve code clarity. This commit does not actually change the page size, only provide the infrastructure to make it possible to do so. initdb forced because of addition of a field to pg_control. Mark Wong, with some help from Simon Riggs and Tom Lane.
2006-03-31Clean up WAL/buffer interactions as per my recent proposal. Get rid of theTom Lane
misleadingly-named WriteBuffer routine, and instead require routines that change buffer pages to call MarkBufferDirty (which does exactly what it says). We also require that they do so before calling XLogInsert; this takes care of the synchronization requirement documented in SyncOneBuffer. Note that because bufmgr takes the buffer content lock (in shared mode) while writing out any buffer, it doesn't matter whether MarkBufferDirty is executed before the buffer content change is complete, so long as the content change is completed before releasing exclusive lock on the buffer. So it's OK to set the dirtybit before we fill in the LSN. This eliminates the former kluge of needing to set the dirtybit in LockBuffer. Aside from making the code more transparent, we can also add some new debugging assertions, in particular that the caller of MarkBufferDirty must hold the buffer content lock, not merely a pin.
2006-03-29Clean up and document the API for XLogOpenRelation and XLogReadBuffer.Tom Lane
This commit doesn't make much functional change, but it does eliminate some duplicated code --- for instance, PageIsNew tests are now done inside XLogReadBuffer rather than by each caller. The GIST xlog code still needs a lot of love, but I'll worry about that separately.
2006-03-28Disable full_page_writes, because turning it off risks causing crash-recoveryTom Lane
failures even when the hardware and OS did nothing wrong. Per recent analysis of a problem report from Alex Bahdushka. For the moment I've just diked out the test of the parameter, rather than removing the GUC infrastructure and documentation, in case we conclude that there's something salvageable there. There seems no chance of it being resurrected in the 8.1 branch though.
2006-03-24Arrange to emit a description of the current XLOG record as error contextTom Lane
when an error occurs during xlog replay. Also, replace the former risky 'write into a fixed-size buffer with no overflow detection' API for XLOG record description routines; use an expansible StringInfo instead. (The latter accounts for most of the patch bulk.) Qingqing Zhou
2006-03-05Update copyright for 2006. Update scripts.Bruce Momjian
2006-01-11Cosmetic code cleanup: fix a bunch of places that used "return (expr);"Neil Conway
rather than "return expr;" -- the latter style is used in most of the tree. I kept the parentheses when they were necessary or useful because the return expression was complex.
2005-12-29Get rid of the SpinLockAcquire/SpinLockAcquire_NoHoldoff distinctionTom Lane
in favor of having just one set of macros that don't do HOLD/RESUME_INTERRUPTS (hence, these correspond to the old SpinLockAcquire_NoHoldoff case). Given our coding rules for spinlock use, there is no reason to allow CHECK_FOR_INTERRUPTS to be done while holding a spinlock, and also there is no situation where ImmediateInterruptOK will be true while holding a spinlock. Therefore doing HOLD/RESUME_INTERRUPTS while taking/releasing a spinlock is just a waste of cycles. Qingqing Zhou and Tom Lane.
2005-12-28Arrange to set the LC_XXX environment variables to match our localeTom Lane
setup. This protects against undesired changes in locale behavior if someone carelessly does setlocale(LC_ALL, "") (and we know who you are, perl guys).
2005-11-22Re-run pgindent, fixing a problem where comment lines after a blankBruce Momjian
comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.
2005-10-29Message correctionsPeter Eisentraut
2005-10-22Make code for selecting default WAL sync method less confusing.Tom Lane
2005-10-15Standard pgindent run for 8.1.Bruce Momjian
2005-10-03Expand pg_control information so that we can verify that the databaseTom Lane
was created on a machine with alignment rules and floating-point format similar to the current machine. Per recent discussion, this seems like a good idea with the increasing prevalence of 32/64 bit environments.
2005-08-22Rewrite gather-write patch into something less obviously bolted onTom Lane
after the fact. Fix bug with incorrect test for whether we are at end of logfile segment. Arrange for writes triggered by XLogInsert's is-cache-more-than-half-full test to synchronize with the cache boundaries, so that in long transactions we tend to write alternating halves of the cache rather than randomly chosen portions of it; this saves one more write syscall per cache load.
2005-08-22Fix some inconsistent choices of datatypes in xlog.c. Make bufferTom Lane
indexes all be int, rather than variously int, uint16 and uint32; add some casts where necessary to support large buffer arrays.
2005-08-20Convert the arithmetic for shared memory size calculation from 'int'Tom Lane
to 'Size' (that is, size_t), and install overflow detection checks in it. This allows us to remove the former arbitrary restrictions on NBuffers etc. It won't make any difference in a 32-bit machine, but in a 64-bit machine you could theoretically have terabytes of shared buffers. (How efficiently we could manage 'em remains to be seen.) Similarly, num_temp_buffers, work_mem, and maintenance_work_mem can be set above 2Gb on a 64-bit machine. Original patch from Koichi Suzuki, additional work by moi.
2005-08-11Autovacuum loose end mop-up. Provide autovacuum-specific vacuum costTom Lane
delay and limit, both as global GUCs and as table-specific entries in pg_autovacuum. stats_reset_on_server_start is now OFF by default, but a reset is forced if we did WAL replay. XID-wrap vacuums do not ANALYZE, but do FREEZE if it's a template database. Alvaro Herrera
2005-07-30Fix compile for no O_SYNC, but introduced with O_DIRECT.Bruce Momjian
2005-07-29Clean up a number of autovacuum loose ends. Make the stats collectorTom Lane
track shared relations in a separate hashtable, so that operations done from different databases are counted correctly. Add proper support for anti-XID-wraparound vacuuming, even in databases that are never connected to and so have no stats entries. Miscellaneous other bug fixes. Alvaro Herrera, some additional fixes by Tom Lane.