summaryrefslogtreecommitdiff
path: root/src/backend/access/gin
AgeCommit message (Collapse)Author
2009-05-19Fix bug #4814 (wrong subscript in consistent-function call), and add someTom Lane
minimal regression test coverage for matchPartialInPendingList().
2009-04-05Fix infinite loop while checking of partial match in pending list.Teodor Sigaev
Improve comments. Now GIN-indexable operators should be strict. Per Tom's questions/suggestions.
2009-03-25Adjust the APIs for GIN opclass support functions to allow the extractQuery()Tom Lane
method to pass extra data to the consistent() and comparePartial() methods. This is the core infrastructure needed to support the soon-to-appear contrib/btree_gin module. The APIs are still upward compatible with the definitions used in 8.3 and before, although *not* with the previous 8.4devel function definitions. catversion bump for changes in pg_proc entries (although these are just cosmetic, since GIN doesn't actually look at the function signature before calling it...) Teodor Sigaev and Oleg Bartunov
2009-03-24Install a search tree depth limit in GIN bulk-insert operations, to preventTom Lane
them from degrading badly when the input is sorted or nearly so. In this scenario the tree is unbalanced to the point of becoming a mere linked list, so insertions become O(N^2). The easiest and most safely back-patchable solution is to stop growing the tree sooner, ie limit the growth of N. We might later consider a rebalancing tree algorithm, but it's not clear that the benefit would be worth the cost and complexity. Per report from Sergey Burladyan and an earlier complaint from Heikki. Back-patch to 8.2; older versions didn't have GIN indexes.
2009-03-24Implement "fastupdate" support for GIN indexes, in which we try to accumulateTom Lane
multiple index entries in a holding area before adding them to the main index structure. This helps because bulk insert is (usually) significantly faster than retail insert for GIN. This patch also removes GIN support for amgettuple-style index scans. The API defined for amgettuple is difficult to support with fastupdate, and the previously committed partial-match feature didn't really work with it either. We might eventually figure a way to put back amgettuple support, but it won't happen for 8.4. catversion bumped because of change in GIN's pg_am entry, and because the format of GIN indexes changed on-disk (there's a metapage now, and possibly a pending list). Teodor Sigaev
2009-01-20Add a new option to RestoreBkpBlocks() to indicate if a cleanup lock shouldHeikki Linnakangas
be used instead of the normal exclusive lock, and make WAL redo functions responsible for calling RestoreBkpBlocks(). They know better what kind of a lock they need. At the moment, this just moves things around with no functional change, but makes the hot standby patch that's under review cleaner.
2009-01-10Revise the TIDBitmap API to support multiple concurrent iterations over aTom Lane
bitmap. This is extracted from Greg Stark's posix_fadvise patch; it seems worth committing separately, since it's potentially useful independently of posix_fadvise.
2009-01-05Change the reloptions machinery to use a table-based parser, and provideAlvaro Herrera
a more complete framework for writing custom option processing routines by user-defined access methods. Catalog version bumped due to the general API changes, which are going to affect user-defined "amoptions" routines.
2009-01-01Update copyright for 2009.Bruce Momjian
2008-11-19Rethink the way FSM truncation works. Instead of WAL-logging FSMHeikki Linnakangas
truncations in FSM code, call FreeSpaceMapTruncateRel from smgr_redo. To make that cleaner from modularity point of view, move the WAL-logging one level up to RelationTruncate, and move RelationTruncate and all the related WAL-logging to new src/backend/catalog/storage.c file. Introduce new RelationCreateStorage and RelationDropStorage functions that are used instead of calling smgrcreate/smgrscheduleunlink directly. Move the pending rel deletion stuff from smgrcreate/smgrscheduleunlink to the new functions. This leaves smgr.c as a thin wrapper around md.c; all the transactional stuff is now in storage.c. This will make it easier to add new forks with similar truncation logic, like the visibility map.
2008-11-13Prevent synchronous scan during GIN index build, because GIN is optimizedTom Lane
for inserting tuples in increasing TID order. It's not clear whether this fully explains Ivan Sergio Borgonovo's complaint, but simple testing confirms that a scan that doesn't start at block 0 can slow GIN build by a factor of three or four. Backpatch to 8.3. Sync scan didn't exist before that.
2008-11-03Clean up the messy semantics (not to mention inefficiency) of PageGetTempPageTom Lane
by splitting it into three functions with better-defined behaviors. Zdenek Kotala
2008-10-31Unite ReadBufferWithFork, ReadBufferWithStrategy, and ZeroOrReadBufferHeikki Linnakangas
functions into one ReadBufferExtended function, that takes the strategy and mode as argument. There's three modes, RBM_NORMAL which is the default used by plain ReadBuffer(), RBM_ZERO, which replaces ZeroOrReadBuffer, and a new mode RBM_ZERO_ON_ERROR, which allows callers to read corrupt pages without throwing an error. The FSM needs the new mode to recover from corrupt pages, which could happend if we crash after extending an FSM file, and the new page is "torn". Add fork number to some error messages in bufmgr.c, that still lacked it.
2008-10-20Remove mark/restore support in GIN and GiST indexes.Teodor Sigaev
Per Tom's comment. Also revome useless GISTScanOpaque->flags field.
2008-10-06Index FSMs needs to be vacuumed as well. Report by Jeff Davis.Heikki Linnakangas
2008-09-30Rewrite the FSM. Instead of relying on a fixed-size shared memory segment, theHeikki Linnakangas
free space information is stored in a dedicated FSM relation fork, with each relation (except for hash indexes; they don't use FSM). This eliminates the max_fsm_relations and max_fsm_pages GUC options; remove any trace of them from the backend, initdb, and documentation. Rewrite contrib/pg_freespacemap to match the new FSM implementation. Also introduce a new variant of the get_raw_page(regclass, int4, int4) function in contrib/pageinspect that let's you to return pages from any relation fork, and a new fsm_page_contents() function to inspect the new FSM pages.
2008-09-04Fix strategy propagation to scanEntry for partial match by moving propagationTeodor Sigaev
to initializaion of scanEntry.
2008-07-11Multi-column GIN indexes. Teodor SigaevTom Lane
2008-07-08Minor improvements to the Gin internal documentation.Neil Conway
2008-07-04Fix initialization of GinScanEntryData.partialMatchTeodor Sigaev
2008-06-29Remove unnecessary coziness of GIN code with datum copying. Now thatTom Lane
space is tracked via GetMemoryChunkSpace, there's really no advantage to duplicating datumCopy's innards here. This is one bit of my toast indirection patch that should go in anyway.
2008-06-19Improve our #include situation by moving pointer types away from theAlvaro Herrera
corresponding struct definitions. This allows other headers to avoid including certain highly-loaded headers such as rel.h and relscan.h, instead using just relcache.h, heapam.h or genam.h, which are more lightweight and thus cause less unnecessary dependencies.
2008-06-12Refactor XLogOpenRelation() and XLogReadBuffer() in preparation for relationHeikki Linnakangas
forks. XLogOpenRelation() and the associated light-weight relation cache in xlogutils.c is gone, and XLogReadBuffer() now takes a RelFileNode as argument, instead of Relation. For functions that still need a Relation struct during WAL replay, there's a new function called CreateFakeRelcacheEntry() that returns a fake entry like XLogOpenRelation() used to.
2008-06-08Move BufferGetPageSize and BufferGetPage from bufpage.h to bufmgr.h. It isAlvaro Herrera
more logical that way, and also it reduces the amount of unnecessary includes in bufpage.h, which is widely used. Zdenek Kotala. My previous patch to bufpage.h should also have credited him as author, but I forgot (sorry about that).
2008-05-16Extend GIN to support partial-match searches, and extend tsquery to supportTom Lane
prefix matching using this facility. Teodor Sigaev and Oleg Bartunov
2008-05-16Persuade GIN to react to control-C in a reasonable amount of timeTom Lane
while building a GIN index.
2008-05-12Put back bufmgr.h in bufpage.h -- it is needed by some macros.Alvaro Herrera
Remove #include bufmgr.h from (most?) source files which already include bufpage.h.
2008-05-12Restructure some header files a bit, in particular heapam.h, by removing someAlvaro Herrera
unnecessary #include lines in it. Also, move some tuple routine prototypes and macros to htup.h, which allows removal of heapam.h inclusion from some .c files. For this to work, a new header file access/sysattr.h needed to be created, initially containing attribute numbers of system columns, for pg_dump usage. While at it, make contrib ltree, intarray and hstore header files more consistent with our header style.
2008-04-22Fix using too many LWLocks bug, reported by Craig RingerTeodor Sigaev
<craig@postnewspapers.com.au>. It was my mistake, I missed limitation of number of held locks, now GIN doesn't use continiuous locks, but still hold buffers pinned to prevent interference with vacuum's deletion algorithm. Backpatch is needed.
2008-04-14Push index operator lossiness determination down to GIST/GIN opclassTom Lane
"consistent" functions, and remove pg_amop.opreqcheck, as per recent discussion. The main immediate benefit of this is that we no longer need 8.3's ugly hack of requiring @@@ rather than @@ to test weight-using tsquery searches on GIN indexes. In future it should be possible to optimize some other queries better than is done now, by detecting at runtime whether the index match is exact or not. Tom Lane, after an idea of Heikki's, and with some help from Teodor.
2008-04-13Phase 2 of project to make index operator lossiness be determined at runtimeTom Lane
instead of plan time. Extend the amgettuple API so that the index AM returns a boolean indicating whether the indexquals need to be rechecked, and make that rechecking happen in nodeIndexscan.c (currently the only place where it's expected to be needed; other callers of index_getnext are just erroring out for now). For the moment, GIN and GIST have stub logic that just always sets the recheck flag to TRUE --- I'm hoping to get Teodor to handle pushing that control down to the opclass consistent() functions. The planner no longer pays any attention to amopreqcheck, and that catalog column will go away in due course.
2008-04-10Replace "amgetmulti" AM functions with "amgetbitmap", in which the wholeTom Lane
indexscan always occurs in one call, and the results are returned in a TIDBitmap instead of a limited-size array of TIDs. This should improve speed a little by reducing AM entry/exit overhead, and it is necessary infrastructure if we are ever to support bitmap indexes. In an only slightly related change, add support for TIDBitmaps to preserve (somewhat lossily) the knowledge that particular TIDs reported by an index need to have their quals rechecked when the heap is visited. This facility is not really used yet; we'll need to extend the forced-recheck feature to plain indexscans before it's useful, and that hasn't been coded yet. The intent is to use it to clean up 8.3's horrid @@@ kluge for text search with weighted queries. There might be other uses in future, but that one alone is sufficient reason. Heikki Linnakangas, with some adjustments by me.
2008-03-20Make source code READMEs more consistent. Add CVS tags to all README files.Bruce Momjian
2008-02-19Refactor backend makefiles to remove lots of duplicate codePeter Eisentraut
2008-01-01Update copyrights in source tree to 2008.Bruce Momjian
2007-11-16Improve GIN index build's tracking of memory usage by usingTom Lane
GetMemoryChunkSpace, not just the palloc request size. This brings the allocatedMemory counter close enough to reality (as measured by MemoryContextStats printouts) that I think we can get rid of the arbitrary factor-of-2 adjustment that was put into the code initially. Given the sensitivity of GIN build to work memory size, not using as much of work memory as we're allowed to seems a pretty bad idea.
2007-11-15Re-run pgindent with updated list of typedefs. (Updated README shouldBruce Momjian
avoid this problem in the future.)
2007-11-15pgindent run for 8.3.Bruce Momjian
2007-11-13Clean up some stray references to tsearch2.Tom Lane
2007-10-29- Add check of already changed page while replay WAL. This touches onlyTeodor Sigaev
ginRedoInsert(), because other ginRedo* functions rewrite whole page or make changes which could be applied several times without consistent's loss - Remove check of identifying of corresponding split record: it's possible that replaying of WAL starts after actual page split, but before removing of that split from incomplete splits list. In this case, that check cause FATAL error. Per stress test which reproduces bug reported by Craig McElroy <craig.mcelroy@contegix.com>
2007-10-29Fix coredump during replay WAL after crash. Change entrySplitPage() to preventTeodor Sigaev
usage of any information from system catalog, because it could be called during replay of WAL. Per bug report from Craig McElroy <craig.mcelroy@contegix.com>. Patch doesn't change on-disk storage.
2007-09-20HOT updates. When we update a tuple without changing any of its indexedTom Lane
columns, and the new version can be stored on the same heap page, we no longer generate extra index entries for the new version. Instead, index searches follow the HOT-chain links to ensure they find the correct tuple version. In addition, this patch introduces the ability to "prune" dead tuples on a per-page basis, without having to do a complete VACUUM pass to recover space. VACUUM is still needed to clean up dead index entries, however. Pavan Deolasee, with help from a bunch of other people.
2007-09-14Remove GIN interface section, which is now documented in SGML.Bruce Momjian
Heikki Linnakangas
2007-09-12Redefine the lp_flags field of item pointers as having four states, ratherTom Lane
than two independent bits (one of which was never used in heap pages anyway, or at least hadn't been in a very long time). This gives us flexibility to add the HOT notions of redirected and dead item pointers without requiring anything so klugy as magic values of lp_off and lp_len. The state values are chosen so that for the states currently in use (pre-HOT) there is no change in the physical representation.
2007-08-21Tsearch2 functionality migrates to core. The bulk of this work is byTom Lane
Oleg Bartunov and Teodor Sigaev, but I did a lot of editorializing, so anything that's broken is probably my fault. Documentation is nonexistent as yet, but let's land the patch so we can get some portability testing done.
2007-06-05Move call of MarkBufferDirty() before XLogInsert() as required.Teodor Sigaev
Many thanks to Heikki Linnakangas <heikki@enterprisedb.com> for his sharp eyes.
2007-06-04Fix bundle bugs of GIN:Teodor Sigaev
- Fix possible deadlock between UPDATE and VACUUM queries. Bug never was observed in 8.2, but it still exist there. HEAD is more sensitive to bug after recent "ring" of buffer improvements. - Fix WAL creation: if parent page is stored as is after split then incomplete split isn't removed during replay. This happens rather rare, only on large tables with a lot of updates/inserts. - Fix WAL replay: there was wrong test of XLR_BKP_BLOCK_* for left page after deletion of page. That causes wrong rightlink field: it pointed to deleted page. - add checking of match of clearing incomplete split - cleanup incomplete split list after proceeding All of this chages doesn't change on-disk storage, so backpatch... But second point may be an issue for replaying logs from previous version.
2007-05-31Replace ReadBuffer to ReadBufferWithStrategy in all vacuum-involved placesTeodor Sigaev
to implement limited-size "ring" of buffers for VACUUM for GIN & GIST
2007-05-27Fix up pgstats counting of live and dead tuples to recognize that committedTom Lane
and aborted transactions have different effects; also teach it not to assume that prepared transactions are always committed. Along the way, simplify the pgstats API by tying counting directly to Relations; I cannot detect any redeeming social value in having stats pointers in HeapScanDesc and IndexScanDesc structures. And fix a few corner cases in which counts might be missed because the relation's pgstat_info pointer hadn't been set.
2007-02-01Fix a few typos in comments in GiN.Neil Conway