path: root/src/include/access
Age | Commit message | Author
2006-04-24 | Suppress more compiler warnings caused by macro tests. | Bruce Momjian
2006-04-24 | Add one more paren to macro. | Bruce Momjian
2006-04-24 | Suppress compiler warning in gcc 4.2. Report by Kris Jurka. | Bruce Momjian
2006-04-14 | Make the world safe for full_page_writes. Allow XLOG records that try to update no-longer-existing pages to fall through as no-ops, but make a note of each page number referenced by such records. If we don't see a later XLOG entry dropping the table or truncating away the page, complain at the end of XLOG replay. Since this fixes the known failure mode for full_page_writes = off, revert my previous band-aid patch that disabled that GUC variable. | Tom Lane
2006-04-13 | Fix an ancient oversight in btree xlog replay. When trying to determine if an upper-level insertion completes a previously-seen split, we cannot simply grab the downlink block number out of the buffer, because the buffer could contain a later state of the page --- or perhaps the page doesn't even exist at all any more, due to relation truncation. These possibilities have been masked up to now because the use of full_page_writes effectively ensured that no xlog replay routine ever actually saw a page state newer than its own change. Since we're deprecating full_page_writes in 8.1.*, there's no need to fix this in existing release branches, but we need a fix in HEAD if we want to have any hope of re-allowing full_page_writes. Accordingly, adjust the contents of btree WAL records so that we can always get the downlink block number from the WAL record rather than having to depend on buffer contents. Per report from Kevin Grittner and Peter Brant. Improve a few comments in related code while at it. | Tom Lane
2006-04-05 | Add a field to the first page of each WAL file to indicate the XLOG_BLCKSZ. This ought to help in preventing configuration mismatch problems if anyone tries to ship PITR files between servers compiled with different XLOG_BLCKSZ settings. Simon Riggs | Tom Lane
2006-04-03 | Define a separately configurable XLOG_BLCKSZ symbol for the page size used within WAL files. Historically this was the same as the data file BLCKSZ, but there's no necessary connection, and it's possible that performance gains might ensue from reducing XLOG_BLCKSZ. In any case distinguishing two symbols should improve code clarity. This commit does not actually change the page size, only provide the infrastructure to make it possible to do so. initdb forced because of addition of a field to pg_control. Mark Wong, with some help from Simon Riggs and Tom Lane. | Tom Lane
2006-04-03 | Eliminate the adjust scan code. Since concurrent GiST it doesn't do real work. That was missed during concurrency development. | Teodor Sigaev
2006-04-01 | Remove the 'slow' path for btree index build, which built the btree incrementally by successive inserts rather than by sorting the data. We were only using the slow path during bootstrap, apparently because when first written it failed during bootstrap --- but it works fine now AFAICT. Removing it saves a hundred or so lines of code and produces noticeably (~10%) smaller initial states of the system catalog indexes. While that won't make much difference for heavily-modified catalogs, for the more static ones there may be a useful long-term performance improvement. | Tom Lane
2006-03-31 | Clean up WAL/buffer interactions as per my recent proposal. Get rid of the misleadingly-named WriteBuffer routine, and instead require routines that change buffer pages to call MarkBufferDirty (which does exactly what it says). We also require that they do so before calling XLogInsert; this takes care of the synchronization requirement documented in SyncOneBuffer. Note that because bufmgr takes the buffer content lock (in shared mode) while writing out any buffer, it doesn't matter whether MarkBufferDirty is executed before the buffer content change is complete, so long as the content change is completed before releasing exclusive lock on the buffer. So it's OK to set the dirtybit before we fill in the LSN. This eliminates the former kluge of needing to set the dirtybit in LockBuffer. Aside from making the code more transparent, we can also add some new debugging assertions, in particular that the caller of MarkBufferDirty must hold the buffer content lock, not merely a pin. | Tom Lane
2006-03-30 | Improve gist XLOG code to follow the coding rules needed to prevent torn-page problems. This introduces some issues of its own, mainly that there are now some critical sections of unreasonably broad scope, but it's a step forward anyway. Further cleanup will require some code refactoring that I'd prefer to get Oleg and Teodor involved in. | Tom Lane
2006-03-29 | Clean up and document the API for XLogOpenRelation and XLogReadBuffer. This commit doesn't make much functional change, but it does eliminate some duplicated code --- for instance, PageIsNew tests are now done inside XLogReadBuffer rather than by each caller. The GIST xlog code still needs a lot of love, but I'll worry about that separately. | Tom Lane
2006-03-24 | Arrange to emit a description of the current XLOG record as error context when an error occurs during xlog replay. Also, replace the former risky 'write into a fixed-size buffer with no overflow detection' API for XLOG record description routines; use an expansible StringInfo instead. (The latter accounts for most of the patch bulk.) Qingqing Zhou | Tom Lane
2006-03-16 | Clean up representation of function RTEs for functions returning RECORD. The original coding stored the raw parser output (ColumnDef and TypeName nodes) which was ugly, bulky, and wrong because it failed to create any dependency on the referenced datatype --- and in fact would not track type renamings and suchlike. Instead store a list of column type OIDs in the RTE. Also fix up general failure of recordDependencyOnExpr to do anything sane about recording dependencies on datatypes. While there are many cases where there will be an indirect dependency (eg if an operator returns a datatype, the dependency on the operator is enough), we do have to record the datatype as a separate dependency in examples like CoerceToDomain. initdb forced because of change of stored rules. | Tom Lane
2006-03-05 | Update copyright for 2006. Update scripts. | Bruce Momjian
2006-02-11 | Skip ambulkdelete scan if there's nothing to delete and the index is not partial. None of the existing AMs do anything useful except counting tuples when there's nothing to delete, and we can get a tuple count from the heap as long as it's not a partial index. (hash actually can skip anyway because it maintains a tuple count in the index metapage.) GIST is not currently able to exploit this optimization because, due to failure to index NULLs, GIST is always effectively partial. Possibly we should fix that sometime. Simon Riggs w/ some review by Tom Lane. | Tom Lane
2006-02-11 | Revert based on Tom's recommendation: "Allow VACUUM to complete faster by avoiding scanning the indexes when no rows were removed from the heap by the VACUUM." | Bruce Momjian
2006-02-11 | Allow VACUUM to complete faster by avoiding scanning the indexes when no rows were removed from the heap by the VACUUM. Simon Riggs | Bruce Momjian
2006-01-25 | Remove the no-longer-useful HashItem/HashItemData level of structure. Same motivation as for BTItem. | Tom Lane
2006-01-25 | Remove the no-longer-useful BTItem/BTItemData level of structure, and just refer to btree index entries as plain IndexTuples, which is what they have been for a very long time. This is mostly just an exercise in removing extraneous notation, but it does save a palloc/pfree cycle per index insertion. | Tom Lane
2006-01-25 | Allow row comparisons to be used as indexscan qualifications. This completes the project to upgrade our handling of row comparisons. | Tom Lane
2006-01-23 | Instead of using a numberOfRequiredKeys count to distinguish required and non-required keys in a btree index scan, mark the required scankeys with private flag bits SK_BT_REQFWD and/or SK_BT_REQBKWD. This seems at least marginally clearer to me, and it eliminates a wired-into-the-data-structure assumption that required keys are consecutive. Even though that assumption will remain true for the foreseeable future, having it in there makes the code seem more complex than necessary. | Tom Lane
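As a rough illustration of the flag-bit approach described in the 2006-01-23 entry, the sketch below marks each key individually instead of relying on a count of leading required keys. Only the names SK_BT_REQFWD and SK_BT_REQBKWD come from the commit message; the bit values, the `DemoScanKey` struct, and the `demo_key_failure_ends_scan` helper are assumptions made for this example and are not the actual nbtree code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit values only; the real definitions live in the btree headers. */
#define SK_BT_REQFWD   0x00010000u  /* key is required to continue a forward scan */
#define SK_BT_REQBKWD  0x00020000u  /* key is required to continue a backward scan */

/* Hypothetical, stripped-down stand-in for a scan key. */
typedef struct DemoScanKey
{
    uint32_t sk_flags;          /* flag bits, including the private bits above */
} DemoScanKey;

typedef enum
{
    DEMO_FORWARD,
    DEMO_BACKWARD
} DemoScanDirection;

/*
 * Returns true if a failure of this key means that no later tuple in the
 * given scan direction can match, so the scan may stop early.
 * The decision looks only at the key's own flags, so required keys do not
 * have to be stored consecutively.
 */
bool
demo_key_failure_ends_scan(const DemoScanKey *key, DemoScanDirection dir)
{
    uint32_t required;

    assert(key != NULL);
    required = (dir == DEMO_FORWARD) ? SK_BT_REQFWD : SK_BT_REQBKWD;
    return (key->sk_flags & required) != 0;
}
```

A key carrying both bits ends the scan on failure in either direction, while a key with neither bit only rejects the current tuple.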
2006-01-14 | Some minor code cleanup, falling out from the removal of rtree. SK_NEGATE isn't being used anywhere anymore, and there seems no point in a generic index_keytest() routine when two out of three remaining access methods aren't using it. Also, add a comment documenting a convention for letting access methods define private flag bits in ScanKey sk_flags. There are no such flags at the moment but I'm thinking about changing btree's handling of "required keys" to use flag bits in the keys rather than a count of required key positions. Also, if some AM did still want SK_NEGATE then it would be reasonable to treat it as a private flag bit. | Tom Lane
2005-12-07 | Push the responsibility for handling ignore_killed_tuples down into _bt_checkkeys(), instead of checking it in the top-level nbtree.c routines as formerly. This saves a little bit of loop overhead, but more importantly it lets us skip performing the index key comparisons for dead tuples. | Tom Lane
2005-12-06 | Get rid of slru.c's hardwired insistence on a fixed number of slots per SLRU area. The number of slots is still a compile-time constant (someday we might want to change that), but at least it's a different constant for each SLRU area. Increase number of subtrans buffers to 32 based on experimentation with a heavily subtrans-bashing test case, and increase number of multixact member buffers to 16, since it's obviously silly for it not to be at least twice the number of multixact offset buffers. | Tom Lane
2005-12-06 | Arrange for read-only accesses to SLRU page buffers to take only a shared lock, not exclusive, if the desired page is already in memory. This can be demonstrated to be a significant win on the pg_subtrans cache when there is a large window of open transactions. It should be useful for pg_clog as well. I didn't try to make GetMultiXactIdMembers() use the code, as that would have taken some restructuring, and what with the local cache for multixact contents it probably wouldn't really make a difference. Per my recent proposal. | Tom Lane
2005-12-03 | Tweak indexscan machinery to avoid taking an AccessShareLock on an index if we already have a stronger lock due to the index's table being the update target table of the query. Same optimization I applied earlier at the table level. There doesn't seem to be much interest in the more radical idea of not locking indexes at all, so do what we can ... | Tom Lane
2005-11-26 | Change seqscan logic so that we check visibility of all tuples on a page when we first read the page, rather than checking them one at a time. This allows us to take and release the buffer content lock just once per page, instead of once per tuple. Since it's a shared lock the contention penalty for holding the lock longer shouldn't be too bad. We can safely do this only when using an MVCC snapshot; else the assumption that visibility won't change over time is uncool. Therefore there are now two code paths depending on the snapshot type. I also made the same change in nodeBitmapHeapscan.c, where it can be done always because we only support MVCC snapshots for bitmap scans anyway. Also make some incidental cleanups in the APIs of these functions. Per a suggestion from Qingqing Zhou. | Tom Lane
2005-11-22 | Re-run pgindent, fixing a problem where comment lines after a blank comment line were output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X. | Bruce Momjian
2005-11-20 | Remove the t_datamcxt field of HeapTupleData. This was introduced for the convenience of tuptoaster.c and is no longer needed, so may as well get rid of some small amount of overhead. | Tom Lane
2005-11-20 | Modify tuptoaster's API so that it does not try to modify the passed tuple in-place, but instead passes back an all-new tuple structure if any changes are needed. This is a much cleaner and more robust solution for the bug discovered by Alexey Beschiokov; accordingly, revert the quick hack I installed yesterday. With this change, HeapTupleData.t_datamcxt is no longer needed; will remove it in a separate commit in HEAD only. | Tom Lane
2005-11-07 | R-tree is dead ... long live GiST. | Tom Lane
2005-11-06 | Add simple sanity checks on newly-read pages to GiST, too. | Tom Lane
2005-11-06 | Add defenses to btree and hash index AMs to do simple sanity checks on every index page they read; in particular to catch the case of an all-zero page, which PageHeaderIsValid allows to pass. It turns out hash already had this idea, but it was just Assert()ing things rather than doing a straight error check, and the Asserts were partially redundant with PageHeaderIsValid anyway. Per recent failure example from Jim Nasby. (gist still needs the same treatment.) | Tom Lane
2005-11-05 | Clean up representation of SLRU page state. This is the cleaner fix for the SLRU race condition that I posted a few days ago, but we decided not to use in 8.1 and older branches. | Tom Lane
2005-10-15 | Standard pgindent run for 8.1. | Bruce Momjian
2005-10-07 | Remove an unused typedef. | Alvaro Herrera
2005-09-02 | Clean up a couple of ad-hoc computations of the maximum number of tuples on a page, as suggested by ITAGAKI Takahiro. Also, change a few places that were using some other estimates of max-items-per-page to consistently use MaxOffsetNumber. This is conservatively large --- we could have used the new MaxHeapTuplesPerPage macro, or a similar one for index tuples --- but those places are simply declaring a fixed-size buffer and assuming it will work, rather than actively testing for overrun. It seems safer to size these buffers in a way that can't overflow even if the page is corrupt. | Tom Lane
2005-08-20 | Convert the arithmetic for shared memory size calculation from 'int' to 'Size' (that is, size_t), and install overflow detection checks in it. This allows us to remove the former arbitrary restrictions on NBuffers etc. It won't make any difference in a 32-bit machine, but in a 64-bit machine you could theoretically have terabytes of shared buffers. (How efficiently we could manage 'em remains to be seen.) Similarly, num_temp_buffers, work_mem, and maintenance_work_mem can be set above 2Gb on a 64-bit machine. Original patch from Koichi Suzuki, additional work by moi. | Tom Lane
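The overflow detection mentioned in the 2005-08-20 entry can be pictured with a minimal sketch like the one below. It assumes nothing about the committed code: the helper names `demo_add_size` and `demo_mul_size` and their boolean return convention are invented here for illustration, and the real server code may report the error differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: add two sizes.
 * Returns false (and leaves *result untouched) if the sum would not fit
 * in size_t, instead of silently wrapping around.
 */
bool
demo_add_size(size_t a, size_t b, size_t *result)
{
    assert(result != NULL);
    if (b > SIZE_MAX - a)
        return false;           /* a + b would overflow */
    *result = a + b;
    return true;
}

/*
 * Hypothetical helper: multiply two sizes with the same overflow rule.
 * The a != 0 test avoids dividing by zero; 0 * b can never overflow.
 */
bool
demo_mul_size(size_t a, size_t b, size_t *result)
{
    assert(result != NULL);
    if (a != 0 && b > SIZE_MAX / a)
        return false;           /* a * b would overflow */
    *result = a * b;
    return true;
}
```

A caller sizing a buffer pool might compute `demo_mul_size(nbuffers, block_size, &total)` and refuse to start when it returns false, rather than allocating a wrapped-around, too-small region.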
2005-08-20 | Make GetMultiXactIdMembers() a public function. | Tatsuo Ishii
2005-08-20 | Repair problems with VACUUM destroying t_ctid chains too soon, and with insufficient paranoia in code that follows t_ctid links. (We must do both because even with VACUUM doing it properly, the intermediate state with a dangling t_ctid link is visible concurrently during lazy VACUUM, and could be seen afterwards if either type of VACUUM crashes partway through.) Also try to improve documentation about what's going on. Patch is a bit bulky because passing the XMAX information around required changing the APIs of some low-level heapam.c routines, but it's not conceptually very complicated. Per trouble report from Teodor and subsequent analysis. This needs to be back-patched, but I'll do that after 8.1 beta is out. | Tom Lane
2005-08-12 | Solve the problem of OID collisions by probing for duplicate OIDs whenever we generate a new OID. This prevents occasional duplicate-OID errors that can otherwise occur once the OID counter has wrapped around. Duplicate relfilenode values are also checked for when creating new physical files. Per my recent proposal. | Tom Lane
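The probing idea in the 2005-08-12 entry can be sketched roughly as follows: advance a wrapping counter and skip any value that is already taken. Everything named here (`DemoOid`, `DemoOidGenerator`, `DemoOidSet`, `demo_get_new_oid`, and the lower bound `DEMO_FIRST_NORMAL_OID`) is hypothetical and chosen for the example; the committed code probes catalogs and indexes, not an in-memory array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t DemoOid;

/* Assumed lower bound for generated OIDs; lower values are treated as reserved. */
#define DEMO_FIRST_NORMAL_OID 16384u

/* Hypothetical generator state: a counter plus a caller-supplied lookup. */
typedef struct DemoOidGenerator
{
    DemoOid next_oid;                           /* next candidate to try */
    bool  (*in_use)(DemoOid oid, void *arg);    /* probe for an existing OID */
    void   *arg;                                /* passed through to in_use */
} DemoOidGenerator;

/* Toy "catalog": a plain array of OIDs that are already assigned. */
typedef struct DemoOidSet
{
    const DemoOid *oids;
    size_t         count;
} DemoOidSet;

/* Probe callback for DemoOidSet: linear search through the array. */
bool
demo_oid_set_contains(DemoOid oid, void *arg)
{
    const DemoOidSet *set = (const DemoOidSet *) arg;
    size_t i;

    for (i = 0; i < set->count; i++)
        if (set->oids[i] == oid)
            return true;
    return false;
}

/*
 * Return the next OID that the probe reports as unused.
 * Note: this sketch loops forever if every normal OID is in use.
 */
DemoOid
demo_get_new_oid(DemoOidGenerator *gen)
{
    assert(gen != NULL && gen->in_use != NULL);
    for (;;)
    {
        DemoOid candidate;

        /* After wraparound, skip the reserved low range. */
        if (gen->next_oid < DEMO_FIRST_NORMAL_OID)
            gen->next_oid = DEMO_FIRST_NORMAL_OID;

        candidate = gen->next_oid++;    /* unsigned: wraps to 0 past UINT32_MAX */

        if (!gen->in_use(candidate, gen->arg))
            return candidate;
    }
}
```

Without the probe, a counter that has wrapped around would hand out values that still belong to old objects; with it, those values are simply skipped.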
2005-08-01 | Add NOWAIT option to SELECT FOR UPDATE/SHARE. Original patch by Hans-Juergen Schoenig, revisions by Karel Zak and Tom Lane. | Tom Lane
2005-07-29 | Clean up a number of autovacuum loose ends. Make the stats collector track shared relations in a separate hashtable, so that operations done from different databases are counted correctly. Add proper support for anti-XID-wraparound vacuuming, even in databases that are never connected to and so have no stats entries. Miscellaneous other bug fixes. Alvaro Herrera, some additional fixes by Tom Lane. | Tom Lane
2005-07-06 | Add pg_column_size() to return storage size of a column, including possible compression. Mark Kirkwood | Bruce Momjian
2005-07-04 | Arrange for the postmaster (and standalone backends, initdb, etc) to chdir into PGDATA and subsequently use relative paths instead of absolute paths to access all files under PGDATA. This seems to give a small performance improvement, and it should make the system more robust against naive DBAs doing things like moving a database directory that has a live postmaster in it. Per recent discussion. | Tom Lane
2005-06-30 | Bug fixes for GiST crash recovery. | Teodor Sigaev
- add forgotten check of lsn for insert completion
- remove level of pages: hard to check in recovery
- some cleanups
2005-06-29 | Clean up the rather historically encumbered interface to now() and current time: provide a GetCurrentTimestamp() function that returns current time in the form of a TimestampTz, instead of separate time_t and microseconds fields. This is what all the callers really want anyway, and it eliminates low-level dependencies on AbsoluteTime, which is a deprecated datatype that will have to disappear eventually. | Tom Lane
2005-06-28 | Replace pg_shadow and pg_group by new role-capable catalogs pg_authid and pg_auth_members. There are still many loose ends to finish in this patch (no documentation, no regression tests, no pg_dump support for instance). But I'm going to commit it now anyway so that Alvaro can make some progress on shared dependencies. The catalog changes should be pretty much done. | Tom Lane
2005-06-27 | Concurrency for GiST | Teodor Sigaev
- full concurrency for insert/update/select/vacuum:
  - select and vacuum never lock more than one page simultaneously
  - select (gettuple) holds no lock across its calls
  - insert never locks more than two pages simultaneously:
    - during the search for the leaf to insert into, it locks only one page at a time
    - while walking upward to the root, it locks only the parent (possibly a non-direct parent) and the child; one of them is X-locked, the other may be S- or X-locked
- 'vacuum full' locks the index
- improve gistgetmulti
- simplify XLOG records
Fix bug in index_beginscan_internal: LockRelation may clean the rd_aminfo structure, so move GET_REL_PROCEDURE after LockRelation.