Repair logic flaw in cost estimator: cost_nestloop() was estimating CPU costs using the inner path's parent->rows count as the number of tuples processed per inner scan iteration. This is wrong when we are using an inner indexscan with indexquals based on join clauses, because the rows count in a Relation node reflects the selectivity of the restriction clauses for that rel only. The upshot was that if the join clause was very selective, we'd drastically overestimate the true cost of the join. The fix is to calculate the correct output-rows estimate for an inner indexscan when the IndexPath node is created and save it in the path node. The change to the path node doesn't require initdb, since path nodes don't appear in saved rules.
author: Tom Lane <tgl@sss.pgh.pa.us> 2000-03-22 22:08:35 +0000
committer: Tom Lane <tgl@sss.pgh.pa.us> 2000-03-22 22:08:35 +0000
commit: 1d5e7a6f46f799628392fc4a024a3d61e3dd1630 (patch)
tree: 991172a18881641faae08ec32d986d280d45f602 /src/backend/optimizer/path/indxpath.c
parent: d825e55c1315baeea58a3752b86a2a1c4c77e03c (diff)
1 files changed, 21 insertions, 1 deletions
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index edb16ce0d6d..8c63d9e1c38 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -9,7 +9,7 @@
  *
  *
  * IDENTIFICATION
- *	  $Header: /cvsroot/pgsql/src/backend/optimizer/path/indxpath.c,v 1.80 2000/02/15 20:49:16 tgl Exp $
+ *	  $Header: /cvsroot/pgsql/src/backend/optimizer/path/indxpath.c,v 1.81 2000/03/22 22:08:33 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -1454,6 +1454,26 @@ index_innerjoin(Query *root, RelOptInfo *rel, IndexOptInfo *index,
 		/* joinrelids saves the rels needed on the outer side of the join */
 		pathnode->joinrelids = lfirst(outerrelids_list);
 
+		/*
+		 * We must compute the estimated number of output rows for the
+		 * indexscan.  This is less than rel->rows because of the additional
+		 * selectivity of the join clauses.  Since clausegroup may contain
+		 * both restriction and join clauses, we have to do a set union to
+		 * get the full set of clauses that must be considered to compute
+		 * the correct selectivity.  (We can't just nconc the two lists;
+		 * then we might have some restriction clauses appearing twice,
+		 * which'd mislead restrictlist_selectivity into double-counting
+		 * their selectivity.)
+		 */
+		pathnode->rows = rel->tuples *
+			restrictlist_selectivity(root,
+									 LispUnion(rel->baserestrictinfo,
+											   clausegroup),
+									 lfirsti(rel->relids));
+		/* Like costsize.c, force estimate to be at least one row */
+		if (pathnode->rows < 1.0)
+			pathnode->rows = 1.0;
+
 		cost_index(&pathnode->path, root, rel, index, indexquals, true);
 
 		path_list = lappend(path_list, pathnode);
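To make the arithmetic in the new comment concrete, here is a minimal standalone sketch of why the clause lists have to be merged as a set union rather than concatenated. It is not PostgreSQL code: the `Clause` struct, the integer clause ids, and the helpers `union_selectivity`, `concat_selectivity`, and `inner_scan_rows` are hypothetical stand-ins, and it assumes clause selectivities are independent and simply multiply, and that the restriction list contains no duplicates of its own.

```c
#include <assert.h>

/* Hypothetical stand-in for a clause: an identity plus an assumed selectivity. */
typedef struct
{
	int		id;		/* two entries with the same id are the same clause */
	double	sel;	/* assumed fraction of tuples passing the clause */
} Clause;

/* Product of selectivities over the set union of a and b (each id counted once). */
static double
union_selectivity(const Clause *a, int na, const Clause *b, int nb)
{
	double	s = 1.0;
	int		i, j;

	for (i = 0; i < na; i++)
		s *= a[i].sel;
	for (j = 0; j < nb; j++)
	{
		int		seen = 0;

		for (i = 0; i < na; i++)
			if (a[i].id == b[j].id)
				seen = 1;
		if (!seen)
			s *= b[j].sel;
	}
	return s;
}

/* Product over the plain concatenation: shared clauses are counted twice. */
static double
concat_selectivity(const Clause *a, int na, const Clause *b, int nb)
{
	double	s = 1.0;
	int		i;

	for (i = 0; i < na; i++)
		s *= a[i].sel;
	for (i = 0; i < nb; i++)
		s *= b[i].sel;
	return s;
}

/* Rows per inner scan: tuples times selectivity, forced to at least one row. */
static double
inner_scan_rows(double tuples, const Clause *restr, int nrestr,
				const Clause *group, int ngroup)
{
	double	rows;

	assert(tuples >= 0.0);
	rows = tuples * union_selectivity(restr, nrestr, group, ngroup);
	return (rows < 1.0) ? 1.0 : rows;
}
```

With 1024 tuples, one restriction clause of selectivity 0.5, and a clause group holding that same restriction plus a join clause of selectivity 0.125, the union gives 1024 * 0.5 * 0.125 = 64 rows per inner scan. Concatenating the lists would count the restriction twice and give 32, while the restriction-only figure (the analogue of rel->rows that the old code used) would be 512.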