summaryrefslogtreecommitdiff
path: root/src/backend/executor/nodeIndexonlyscan.c
AgeCommit message (Collapse)Author
2019-06-28Fix misleading comment in nodeIndexonlyscan.c.Thomas Munro
The stated reason for acquiring predicate locks on heap pages hasn't existed since commit c01262a8, so fix the comment. Perhaps in a later release we'll also be able to change the code to use tuple locks. Back-patch all the way. Reviewed-by: Ashwin Agrawal Discussion: https://postgr.es/m/CAEepm%3D2GK3FVdnt5V3d%2Bh9njWipCv_fNL%3DwjxyUhzsF%3D0PcbNg%40mail.gmail.com
2016-03-01Change the format of the VM fork to add a second bit per page.Robert Haas
The new bit indicates whether every tuple on the page is already frozen. It is cleared only when the all-visible bit is cleared, and it can be set only when we vacuum a page and find that every tuple on that page is both visible to every transaction and in no need of any future vacuuming. A future commit will use this new bit to optimize away full-table scans that would otherwise be triggered by XID wraparound considerations. A page which is merely all-visible must still be scanned in that case, but a page which is all-frozen need not be. This commit does not attempt that optimization, although that optimization is the goal here. It seems better to get the basic infrastructure in place first. Per discussion, it's very desirable for pg_upgrade to automatically migrate existing VM forks from the old format to the new format. That, too, will be handled in a follow-on patch. Masahiko Sawada, reviewed by Kyotaro Horiguchi, Fujii Masao, Amit Kapila, Simon Riggs, Andres Freund, and others, and substantially revised by me.
2016-01-02Update copyright for 2016Bruce Momjian
Backpatch certain files through 9.1
2015-05-23pgindent run for 9.5Bruce Momjian
2015-05-23Add error check for lossy distance functions in index-only scans.Tom Lane
Maybe we should actually support this, but for the moment let's just throw an error if the opclass tries it.
2015-05-10Code review for foreign/custom join pushdown patch.Tom Lane
Commit e7cb7ee14555cc9c5773e2c102efd6371f6f2005 included some design decisions that seem pretty questionable to me, and there was quite a lot of stuff not to like about the documentation and comments. Clean up as follows: * Consider foreign joins only between foreign tables on the same server, rather than between any two foreign tables with the same underlying FDW handler function. In most if not all cases, the FDW would simply have had to apply the same-server restriction itself (far more expensively, both for lack of caching and because it would be repeated for each combination of input sub-joins), or else risk nasty bugs. Anyone who's really intent on doing something outside this restriction can always use the set_join_pathlist_hook. * Rename fdw_ps_tlist/custom_ps_tlist to fdw_scan_tlist/custom_scan_tlist to better reflect what they're for, and allow these custom scan tlists to be used even for base relations. * Change make_foreignscan() API to include passing the fdw_scan_tlist value, since the FDW is required to set that. Backwards compatibility doesn't seem like an adequate reason to expect FDWs to set it in some ad-hoc extra step, and anyway existing FDWs can just pass NIL. * Change the API of path-generating subroutines of add_paths_to_joinrel, and in particular that of GetForeignJoinPaths and set_join_pathlist_hook, so that various less-used parameters are passed in a struct rather than as separate parameter-list entries. The objective here is to reduce the probability that future additions to those parameter lists will result in source-level API breaks for users of these hooks. It's possible that this is even a small win for the core code, since most CPU architectures can't pass more than half a dozen parameters efficiently anyway. I kept root, joinrel, outerrel, innerrel, and jointype as separate parameters to reduce code churn in joinpath.c --- in particular, putting jointype into the struct would have been problematic because of the subroutines' habit of changing their local copies of that variable. * Avoid ad-hocery in ExecAssignScanProjectionInfo. It was probably all right for it to know about IndexOnlyScan, but if the list is to grow we should refactor the knowledge out to the callers. * Restore nodeForeignscan.c's previous use of the relcache to avoid extra GetFdwRoutine lookups for base-relation scans. * Lots of cleanup of documentation and missed comments. Re-order some code additions into more logical places.
2015-01-06Update copyright for 2015Bruce Momjian
Backpatch certain files through 9.0
2014-05-06Improve comment for tricky aspect of index-only scans.Jeff Davis
Index-only scans avoid taking a lock on the VM buffer, which would cause a lot of contention. To be correct, that requires some intricate assumptions that weren't completely documented in the previous comment. Reviewed by Robert Haas.
2014-05-06pgindent run for 9.4Bruce Momjian
This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
2014-01-07Update copyright for 2014Bruce Momjian
Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
2013-04-27Incidental cleanup of matviews code.Tom Lane
Move checking for unscannable matviews into ExecOpenScanRelation, which is a better place for it first because the open relation is already available (saving a relcache lookup cycle), and second because this eliminates the problem of telling the difference between rangetable entries that will or will not be scanned by the query. In particular we can get rid of the not-terribly-well-thought-out-or-implemented isResultRel field that the initial matviews patch added to RangeTblEntry. Also get rid of entirely unnecessary scannability check in the rewriter, and a bogus decision about whether RefreshMatViewStmt requires a parse-time snapshot. catversion bump due to removal of a RangeTblEntry field, which changes stored rules.
2013-01-01Update copyrights for 2013Bruce Momjian
Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
2012-09-04Fix serializable mode with index-only scans.Kevin Grittner
Serializable Snapshot Isolation used for serializable transactions depends on acquiring SIRead locks on all heap relation tuples which are used to generate the query result, so that a later delete or update of any of the tuples can flag a read-write conflict between transactions. This is normally handled in heapam.c, with tuple level locking. Since an index-only scan avoids heap access in many cases, building the result from the index tuple, the necessary predicate locks were not being acquired for all tuples in an index-only scan. To prevent problems with tuple IDs which are vacuumed and re-used while the transaction still matters, the xmin of the tuple is part of the tag for the tuple lock. Since xmin is not available to the index-only scan for result rows generated from the index tuples, it is not possible to acquire a tuple-level predicate lock in such cases, in spite of having the tid. If we went to the heap to get the xmin value, it would no longer be an index-only scan. Rather than prohibit index-only scans under serializable transaction isolation, we acquire an SIRead lock on the page containing the tuple, when it was not necessary to visit the heap for other reasons. Backpatch to 9.2. Kevin Grittner and Tom Lane
2012-06-10Run pgindent on 9.2 source tree in preparation for first 9.3Bruce Momjian
commit-fest.
2012-06-07Fix more crash-safe visibility map bugs, and improve comments.Robert Haas
In lazy_scan_heap, we could issue bogus warnings about incorrect information in the visibility map, because we checked the visibility map bit before locking the heap page, creating a race condition. Fix by rechecking the visibility map bit before we complain. Rejigger some related logic so that we rely on the possibly-outdated all_visible_according_to_vm value as little as possible. In heap_multi_insert, it's not safe to clear the visibility map bit before beginning the critical section. The visibility map is not crash-safe unless we treat clearing the bit as a critical operation. Specifically, if the transaction were to error out after we set the bit and before entering the critical section, we could end up writing the heap page to disk (with the bit cleared) and crashing before the visibility map page made it to disk. That would be bad. heap_insert has this correct, but somehow the order of operations got rearranged when heap_multi_insert was added. Also, add some more comments to visibilitymap_test, lazy_scan_heap, and IndexOnlyNext, expounding on concurrency issues. Per extensive code review by Andres Freund, and further review by Tom Lane, who also made the original report about the bogus warnings.
2012-01-25Instrument index-only scans to count heap fetches performed.Robert Haas
Patch by me; review by Tom Lane, Jeff Davis, and Peter Geoghegan.
2012-01-01Update copyright notices for year 2012.Bruce Momjian
2011-10-16Avoid assuming that index-only scan data matches the index's rowtype.Tom Lane
In general the data returned by an index-only scan should have the datatypes originally computed by FormIndexDatum. If the index opclasses use "storage" datatypes different from their input datatypes, the scan tuple will not have the same rowtype attributed to the index; but we had a hard-wired assumption that that was true in nodeIndexonlyscan.c. We'd already hacked around the issue for the one case where the types are different in btree indexes (btree name_ops), but this would definitely come back to bite us if we ever implement index-only scans in GiST. To fix, require the index AM to explicitly provide the tupdesc for the tuple it is returning. btree can just pass back the index's tupdesc, but GiST will have to work harder when and if it supports index-only scans. I had previously proposed fixing this by allowing the index AM to fill the scan tuple slot directly; but on reflection that seemed like a module layering violation, since TupleTableSlots are creatures of the executor. At least in the btree case, it would also be less efficient, since the tuple deconstruction work would occur even for rows later found to be invisible to the scan's snapshot.
2011-10-11Generate index-only scan tuple descriptor from the plan node's indextlist.Tom Lane
Dept. of second thoughts: as long as we've got that tlist hanging around anyway, we can apply ExecTypeFromTL to it to get a suitable descriptor for the ScanTupleSlot. This is a nicer solution than the previous one because it eliminates some hard-wired knowledge about btree name_ops, and because it avoids the somewhat shaky assumption that we needn't set up the scan tuple descriptor in EXPLAIN_ONLY mode. It doesn't change what actually happens at run-time though, and I'm still a bit nervous about that.
2011-10-11Rearrange the implementation of index-only scans.Tom Lane
This commit changes index-only scans so that data is read directly from the index tuple without first generating a faux heap tuple. The only immediate benefit is that indexes on system columns (such as OID) can be used in index-only scans, but this is necessary infrastructure if we are ever to support index-only scans on expression indexes. The executor is now ready for that, though the planner still needs substantial work to recognize the possibility. To do this, Vars in index-only plan nodes have to refer to index columns not heap columns. I introduced a new special varno, INDEX_VAR, to mark such Vars to avoid confusion. (In passing, this commit renames the two existing special varnos to OUTER_VAR and INNER_VAR.) This allows ruleutils.c to handle them with logic similar to what we use for subplan reference Vars. Since index-only scans are now fundamentally different from regular indexscans so far as their expression subtrees are concerned, I also chose to change them to have their own plan node type (and hence, their own executor source file).