path: root/src/backend/access/hash/README
Age         Commit message                                              Author
2025-12-09  Doc: fix typo in hash index documentation  (David Rowley)
Plus a similar fix to the README. Backpatch as far back as the sgml issue exists. The README issue does exist in v14, but that seems unlikely to harm anyone.

Author: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/ed3db7ea-55b4-4809-86af-81ad3bb2c7d3@gmail.com
Backpatch-through: 15
2022-05-31  Fix typo in hash README.  (Amit Kapila)
Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pu-V22PiJF2ym9_NVZe-+qnycfyEX24dZm=7URWhDHJ3w@mail.gmail.com
2021-08-12  Fix grammar mistake in hash index README  (John Naylor)
Dilip Kumar
Discussion: https://www.postgresql.org/message-id/CAFiTN-tjZbuY6vy7kZZ6xO%2BD4mVcO5wOPB5KiwJ3AHhpytd8fg%40mail.gmail.com
2018-08-21  fix typo  (Alvaro Herrera)
2018-03-29  README change: update for hash access method  (Bruce Momjian)
Reported-by: Thomas Munro, Justin Pryzby
Discussion: https://postgr.es/m/CAEepm=1_682z-09DNHj4GkCJAqWK-D6h9Oq5ea84T1oqq1-Utg@mail.gmail.com
2017-09-26  Fix trivial mistake in README.  (Robert Haas)
You might think I (Robert) could manage to count to five without messing it up, but if you did, you would be wrong.

Amit Kapila
Discussion: http://postgr.es/m/CAA4eK1JxqqcuC5Un7YLQVhOYSZBS+t=3xqZuEkt5RyquyuxpwQ@mail.gmail.com
2017-09-22  hash: Implement page-at-a-time scan.  (Robert Haas)
Commit 09cb5c0e7d6fbc9dee26dc429e4fc0f2a88e5272 added a similar optimization to btree back in 2006, but nobody bothered to implement the same thing for hash indexes, probably because they weren't WAL-logged and had lots of other performance problems as well. As with the corresponding btree case, this eliminates the problem of potentially needing to refind our position within the page, and cuts down on pin/unpin traffic as well.

Ashutosh Sharma, reviewed by Alexander Korotkov, Jesper Pedersen, Amit Kapila, and me. Some final edits to comments and README by me.

Discussion: http://postgr.es/m/CAE9k0Pm3KTx93K8_5j6VMzG4h5F+SyknxUwXrN-zqSZ9X8ZS3w@mail.gmail.com
2017-04-03  Expand hash indexes more gradually.  (Robert Haas)
Since hash indexes typically have very few overflow pages, adding a new splitpoint essentially doubles the on-disk size of the index, which can lead to large and abrupt increases in disk usage (and perhaps long delays on occasion). To mitigate this problem to some degree, divide larger splitpoints into four equal phases. This means that, for example, instead of growing from 4GB to 8GB all at once, a hash index will now grow from 4GB to 5GB to 6GB to 7GB to 8GB, which is perhaps still not as smooth as we'd like but certainly an improvement.

This changes the on-disk format of the metapage, so bump HASH_VERSION from 2 to 3. This will force a REINDEX of all existing hash indexes, but that's probably a good idea anyway. First, hash indexes from pre-10 versions of PostgreSQL could easily be corrupted, and we don't want to confuse corruption carried over from an older release with any corruption caused despite the new write-ahead logging in v10. Second, it will let us remove some backward-compatibility code added by commit 293e24e507838733aba4748b514536af2d39d7f2.

Mithun Cy, reviewed by Amit Kapila, Jesper Pedersen and me. Regression test outputs updated by me.

Discussion: http://postgr.es/m/CAD__OuhG6F1gQLCgMQNnMNgoCvOLQZz9zKYJQNYvYmmJoM42gA@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoYty0jCf-pa+m+vYUJ716+AxM7nv_syvyanyf5O-L_i2A@mail.gmail.com
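The phased growth lends itself to a quick arithmetic demonstration. The following sketch is illustrative only, not the backend's allocation code: the starting bucket count and the threshold at which splitpoints begin to be divided into phases (SINGLE_PHASE_SPLITPOINTS) are assumptions made up for the example.

    #include <stdio.h>

    #define PHASES_PER_SPLITPOINT    4
    #define SINGLE_PHASE_SPLITPOINTS 10    /* assumed threshold, for illustration */

    int
    main(void)
    {
        unsigned long buckets = 1;         /* assumed starting point */
        int         splitpoint;

        for (splitpoint = 0; splitpoint < 14; splitpoint++)
        {
            unsigned long added = buckets; /* a full splitpoint doubles the index */

            if (splitpoint < SINGLE_PHASE_SPLITPOINTS)
            {
                buckets += added;
                printf("splitpoint %2d          : %lu buckets\n",
                       splitpoint, buckets);
            }
            else
            {
                int         phase;

                for (phase = 1; phase <= PHASES_PER_SPLITPOINT; phase++)
                {
                    buckets += added / PHASES_PER_SPLITPOINT;
                    printf("splitpoint %2d, phase %d: %lu buckets\n",
                           splitpoint, phase, buckets);
                }
            }
        }
        return 0;
    }

Each large splitpoint still ends at the same total size as before; the allocation simply arrives in four quarter-sized steps rather than one doubling.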
2017-03-15  Port single-page btree vacuum logic to hash indexes.  (Robert Haas)
This is advantageous for hash indexes for the same reasons it's good for btrees: it accelerates space recycling, reducing bloat.

Ashutosh Sharma, reviewed by Amit Kapila and by me. A bit of additional hacking by me.

Discussion: http://postgr.es/m/CAE9k0PkRSyzx8dOnokEpUi2A-RFZK72WN0h9DEMv_ut9q6bPRw@mail.gmail.com
2017-03-15  Cosmetic fixes for hash index write-ahead logging.  (Robert Haas)
Amit Kapila. One of these was reported by Tom Lane.

Discussion: http://postgr.es/m/5515.1489514099@sss.pgh.pa.us
2017-03-14  hash: Add write-ahead logging support.  (Robert Haas)
The warning about hash indexes not being write-ahead logged and their use being discouraged has been removed. "snapshot too old" is now supported for tables with hash indexes. Most importantly, barring bugs, hash indexes will now be crash-safe and usable on standbys. This commit doesn't yet add WAL consistency checking for hash indexes, as we now have for other index types; a separate patch has been submitted to cure that lack.

Amit Kapila, reviewed and slightly modified by me. The larger patch series of which this is a part has been reviewed and tested by Álvaro Herrera, Ashutosh Sharma, Mark Kirkwood, Jeff Janes, and Jesper Pedersen.

Discussion: http://postgr.es/m/CAA4eK1JOBX=YU33631Qh-XivYXtPSALh514+jR8XeD7v+K3r_Q@mail.gmail.com
2017-02-07  Cache hash index's metapage in rel->rd_amcache.  (Robert Haas)
This avoids a very significant amount of buffer manager traffic and contention when scanning hash indexes, because it's no longer necessary to lock and pin the metapage for every scan. We do need some way of figuring out when the cache is too stale to use any more, so that when we lock the primary bucket page to which the cached metapage points us, we can tell whether a split has occurred since we cached the metapage data. To do that, we use the hash_prevblkno field in the primary bucket page, which would otherwise always be set to InvalidBlockNumber.

This patch contains code so that it will continue working (although less efficiently) with hash indexes built before this change, but perhaps we should consider bumping the hash version and ripping out the compatibility code. That decision can be made later, though.

Mithun Cy, reviewed by Jesper Pedersen, Amit Kapila, and by me. Before committing, I made a number of cosmetic changes to the last posted version of the patch, adjusted _hash_getcachedmetap to be more careful about order of operation, and made some necessary updates to the pageinspect documentation and regression tests.
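A rough sketch of the staleness test this describes follows. It is not the backend code: the idea that the field holds the metapage's maxbucket value as of the bucket's last creation or split is an assumption made for this illustration, as are the struct and function names.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef struct
    {
        uint32_t    maxbucket;          /* cached copy of the metapage's maxbucket */
    } CachedMetaData;

    typedef struct
    {
        uint32_t    split_maxbucket;    /* stand-in for hash_prevblkno: maxbucket
                                         * when this bucket was last created or
                                         * split (assumed for this sketch) */
    } PrimaryBucketPage;

    /*
     * True if the cached metapage can still be trusted for this bucket,
     * i.e. the bucket has not been split since the cache was filled.
     */
    static bool
    cached_metapage_still_valid(const CachedMetaData *cache,
                                const PrimaryBucketPage *page)
    {
        return page->split_maxbucket <= cache->maxbucket;
    }

    int
    main(void)
    {
        CachedMetaData      cache = {.maxbucket = 15};
        PrimaryBucketPage   old_bucket = {.split_maxbucket = 12};
        PrimaryBucketPage   new_bucket = {.split_maxbucket = 20};

        printf("bucket last split at maxbucket 12: cache %s\n",
               cached_metapage_still_valid(&cache, &old_bucket) ? "valid" : "stale");
        printf("bucket last split at maxbucket 20: cache %s\n",
               cached_metapage_still_valid(&cache, &new_bucket) ? "valid" : "stale");
        return 0;
    }

When the test fails, the scan can fall back to reading the real metapage, refreshing the cache, and retrying with an up-to-date bucket mapping.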
2016-11-30  Improve hash index bucket split behavior.  (Robert Haas)
Previously, the right to split a bucket was represented by a heavyweight lock on the page number of the primary bucket page. Unfortunately, this meant that every scan needed to take a heavyweight lock on that bucket also, which was bad for concurrency. Instead, use a cleanup lock on the primary bucket page to indicate the right to begin a split, so that scans only need to retain a pin on that page, which they would have to acquire anyway, and which is also much cheaper.

In addition to reducing the locking cost, this also avoids locking out scans and inserts for the entire lifetime of the split: while the new bucket is being populated with copies of the appropriate tuples from the old bucket, scans and inserts can happen in parallel. There are minor concurrency improvements for vacuum operations as well, though the situation there is still far from ideal.

This patch also removes the unworldly assumption that a split will never be interrupted. With the new code, a split is done in a series of small steps and the system can pick up where it left off if it is interrupted prior to completion. While this patch does not itself add write-ahead logging for hash indexes, it is clearly a necessary first step, since one of the things that could interrupt a split is the removal of electrical power from the machine performing it.

Amit Kapila. I wrote the original design on which this patch is based, and did a good bit of work on the comments and README through multiple rounds of review, but all of the code is Amit's. Also reviewed by Jesper Pedersen, Jeff Janes, and others.

Discussion: http://postgr.es/m/CAA4eK1LfzcZYxLoXS874Ad0+S-ZM60U9bwcyiUZx9mHZ-KCWhw@mail.gmail.com
2013-06-03  Additional spelling corrections  (Stephen Frost)
A few more minor spelling corrections, no functional changes.

Thom Brown
2012-06-26  Reduce use of heavyweight locking inside hash AM.  (Robert Haas)
Avoid using LockPage(rel, 0, lockmode) to protect against changes to the bucket mapping. Instead, an exclusive buffer content lock is now viewed as sufficient permission to modify the metapage, and a shared buffer content lock is used when such modifications need to be prevented.

This more relaxed locking regimen makes it possible that, when we're busy getting a heavyweight lock on the bucket we intend to search or insert into, a bucket split might occur underneath us. To compensate for that possibility, we use a loop-and-retry system: release the metapage content lock, acquire the heavyweight lock on the target bucket, and then reacquire the metapage content lock and check that the bucket mapping has not changed. Normally it hasn't, and we're done. But if by chance it has, we simply unlock the metapage, release the heavyweight lock we acquired previously, lock the new bucket, and loop around again. Even in the worst case we cannot loop very many times here, since we don't split the same bucket again until we've split all the other buckets, and 2^N gets big pretty fast.

This results in greatly improved concurrency, because we're effectively replacing two lwlock acquire-and-release cycles in exclusive mode (on one of the lock manager locks) with a single acquire-and-release cycle in shared mode (on the metapage buffer content lock). Testing shows that it's still not quite as good as btree; for that, we'd probably have to find some way of getting rid of the heavyweight bucket locks as well, which does not appear straightforward.

Patch by me, review by Jeff Janes.
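To make the retry loop concrete, here is a small self-contained simulation. It is a sketch, not the hash AM code: the lock and unlock calls are reduced to comments, and the struct and mask fields are simplified stand-ins for what the real metapage carries.

    #include <stdint.h>
    #include <stdio.h>

    typedef struct
    {
        uint32_t    maxbucket;      /* highest bucket number currently in use */
        uint32_t    highmask;       /* masks used to map hash keys to buckets */
        uint32_t    lowmask;
    } MetaPage;

    /* Map a hash key to a bucket, as read while holding the metapage lock. */
    static uint32_t
    key_to_bucket(const MetaPage *meta, uint32_t hashkey)
    {
        uint32_t    bucket = hashkey & meta->highmask;

        if (bucket > meta->maxbucket)
            bucket = bucket & meta->lowmask;
        return bucket;
    }

    int
    main(void)
    {
        MetaPage    meta = {.maxbucket = 3, .highmask = 7, .lowmask = 3};
        uint32_t    hashkey = 44;   /* arbitrary example key */
        uint32_t    bucket;
        uint32_t    recheck;
        int         attempt = 0;

        /* (take shared metapage content lock) */
        bucket = key_to_bucket(&meta, hashkey);

        for (;;)
        {
            attempt++;
            /* (release metapage lock; take heavyweight lock on "bucket") */

            /* Simulate a concurrent split sneaking in on the first attempt. */
            if (attempt == 1)
                meta.maxbucket = 4;

            /* (re-take shared metapage content lock) */
            recheck = key_to_bucket(&meta, hashkey);
            if (recheck == bucket)
                break;              /* mapping unchanged: safe to proceed */

            /* (drop the old bucket's heavyweight lock) and retry */
            bucket = recheck;
        }
        printf("settled on bucket %u after %d attempt(s)\n",
               (unsigned) bucket, attempt);
        return 0;
    }

The re-check after re-taking the metapage lock is the whole trick: if the mapping moved, we have locked the wrong bucket and must go around again, which, as noted above, can only happen a bounded number of times.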
2010-09-20  Remove cvs keywords from all files.  (Magnus Hagander)
2009-11-01  Fix two serious bugs introduced into hash indexes by the 8.4 patch that made hash indexes keep entries sorted by hash value.  (Tom Lane)
First, the original plans for concurrency assumed that insertions would happen only at the end of a page, which is no longer true; this could cause scans to transiently fail to find index entries in the presence of concurrent insertions. We can compensate by teaching scans to re-find their position after re-acquiring read locks.

Second, neither the bucket split nor the bucket compaction logic had been fixed to preserve hashvalue ordering, so application of either of those processes could lead to permanent corruption of an index, in the sense that searches might fail to find entries that are present. This patch fixes the split and compaction logic to preserve hashvalue ordering, but it cannot do anything about pre-existing corruption. We will need to recommend reindexing all hash indexes in the 8.4.2 release notes.

To buy back the performance loss hereby induced in split and compaction, fix them to use PageIndexMultiDelete instead of retail PageIndexDelete operations. We might later want to do something with qsort'ing the page contents rather than doing a binary search for each insertion, but that seemed more invasive than I cared to risk in a back-patch.

Per bug #5157 from Jeff Janes and subsequent investigation.
2008-03-20  Make source code READMEs more consistent. Add CVS tags to all README files.  (Bruce Momjian)
2008-03-15  Change hash index creation so that rather than always establishing exactly two buckets at the start, we create a number of buckets appropriate for the estimated size of the table.  (Tom Lane)
This avoids a lot of expensive bucket-split actions during initial index build on an already-populated table. This is one of the two core ideas of Tom Raney and Shreya Bhargava's patch to reduce hash index build time. I'm committing it separately to make it easier for people to test the effects of this separately from the effects of their other core idea (pre-sorting the index entries by bucket number).
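As a rough illustration of sizing the initial bucket count from an estimated table size, consider the sketch below. The fill factor of ten tuples per bucket and the power-of-two rounding rule are assumptions made for this example, not the backend's exact values or formula.

    #include <stdio.h>

    /*
     * Pick an initial bucket count for an index on a table estimated to hold
     * est_tuples rows, assuming we aim for tuples_per_bucket entries per
     * bucket (a made-up fill factor for this sketch).
     */
    static unsigned long
    initial_buckets(double est_tuples, int tuples_per_bucket)
    {
        unsigned long nbuckets = 2;     /* never start with fewer than two */
        double      needed = est_tuples / tuples_per_bucket;

        /*
         * Round up to a power of two so the count lands on a splitpoint
         * boundary (again, an assumption of this sketch).
         */
        while ((double) nbuckets < needed)
            nbuckets <<= 1;
        return nbuckets;
    }

    int
    main(void)
    {
        printf("1,000,000 tuples -> %lu initial buckets\n",
               initial_buckets(1000000.0, 10));
        printf("       50 tuples -> %lu initial buckets\n",
               initial_buckets(50.0, 10));
        return 0;
    }

Building all of these buckets up front is what lets the index skip the long chain of bucket splits it would otherwise perform while absorbing an already-populated table.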
2007-04-19  Repair PANIC condition in hash indexes when a previous index extension attempt failed (due to lock conflicts or out-of-space).  (Tom Lane)
We might have already extended the index's filesystem EOF before failing, causing the EOF to be beyond what the metapage says is the last used page. Hence the invariant maintained by the code needs to be "EOF is at or beyond last used page", not "EOF is exactly the last used page".

Problem was created by my patch of 2006-11-19 that attempted to repair bug #2737. Since that was back-patched to 7.4, this needs to be as well. Per report and test case from Vlastimil Krejcir.
2007-01-09  Add a citation to Seltzer and Yigit's Usenix '91 paper about hash table management.  (Tom Lane)
The paper clearly describes many of the ideas embodied in our current hashing code, but as far as I could find out there is not a direct code heritage. (Mike Olsen recalls discussion of this paper at Postgres meetings but believes it "informed the Postgres implementation probably just at the design level". Margo herself says she wasn't involved with Postgres' hash code.) Credit where credit is due 'n all that, even if fifteen years after the fact.
2003-11-29  $Header: -> $PostgreSQL Changes ...  (PostgreSQL Daemon)
2003-09-04  Reimplement hash index locking algorithms, per my recent proposal to pghackers.  (Tom Lane)
This fixes the problem recently reported by Markus Kräutner (hash bucket split corrupts the state of scans being done concurrently), and I believe it also fixes all the known problems with deadlocks in hash index operations. Hash indexes are still not really ready for prime time (since they aren't WAL-logged), but this is a step forward.
2003-09-02  Fix a couple typos, add some more comments.  (Tom Lane)
2003-09-01  Add some internals documentation for hash indexes, including an explanation of the remarkably confusing page addressing scheme.  (Tom Lane)
The file also includes my planned-but-not-yet-implemented revision of the hash index locking scheme.