| 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
 | Summary
-------
These directories take the Query structure returned by the parser, and
generate a plan used by the executor.  The /plan directory generates the
actual output plan, the /path code generates all possible ways to join the
tables, and /prep handles various preprocessing steps for special cases.
/util is utility stuff.  /geqo is the separate "genetic optimization" planner
--- it does a semi-random search through the join tree space, rather than
exhaustively considering all possible join trees.  (But each join considered
by /geqo is given to /path to create paths for, so we consider all possible
implementation paths for each specific join pair even in GEQO mode.)
Paths and Join Pairs
--------------------
During the planning/optimizing process, we build "Path" trees representing
the different ways of doing a query.  We select the cheapest Path that
generates the desired relation and turn it into a Plan to pass to the
executor.  (There is pretty much a one-to-one correspondence between the
Path and Plan trees, but Path nodes omit info that won't be needed during
planning, and include info needed for planning that won't be needed by the
executor.)
The optimizer builds a RelOptInfo structure for each base relation used in
the query.  Base rels are either primitive tables, or subquery subselects
that are planned via a separate recursive invocation of the planner.  A
RelOptInfo is also built for each join relation that is considered during
planning.  A join rel is simply a combination of base rels.  There is only
one join RelOptInfo for any given set of baserels --- for example, the join
{A B C} is represented by the same RelOptInfo no matter whether we build it
by joining A and B first and then adding C, or joining B and C first and
then adding A, etc.  These different means of building the joinrel are
represented as Paths.  For each RelOptInfo we build a list of Paths that
represent plausible ways to implement the scan or join of that relation.
Once we've considered all the plausible Paths for a rel, we select the one
that is cheapest according to the planner's cost estimates.  The final plan
is derived from the cheapest Path for the RelOptInfo that includes all the
base rels of the query.
Possible Paths for a primitive table relation include plain old sequential
scan, plus index scans for any indexes that exist on the table, plus bitmap
index scans using one or more indexes.  A subquery base relation just has
one Path, a "SubqueryScan" path (which links to the subplan that was built
by a recursive invocation of the planner).  Likewise a function-RTE base
relation has only one possible Path.
Joins always occur using two RelOptInfos.  One is outer, the other inner.
Outers drive lookups of values in the inner.  In a nested loop, lookups of
values in the inner occur by scanning the inner path once per outer tuple
to find each matching inner row.  In a mergejoin, inner and outer rows are
ordered, and are accessed in order, so only one scan is required to perform
the entire join: both inner and outer paths are scanned in-sync.  (There's
not a lot of difference between inner and outer in a mergejoin...)  In a
hashjoin, the inner is scanned first and all its rows are entered in a
hashtable, then the outer is scanned and for each row we lookup the join
key in the hashtable.
A Path for a join relation is actually a tree structure, with the top
Path node representing the join method.  It has left and right subpaths
that represent the scan or join methods used for the two input relations.
Join Tree Construction
----------------------
The optimizer generates optimal query plans by doing a more-or-less
exhaustive search through the ways of executing the query.  The best Path
tree is found by a recursive process:
1) Take each base relation in the query, and make a RelOptInfo structure
for it.  Find each potentially useful way of accessing the relation,
including sequential and index scans, and make a Path representing that
way.  All the Paths made for a given relation are placed in its
RelOptInfo.pathlist.  (Actually, we discard Paths that are obviously
inferior alternatives before they ever get into the pathlist --- what
ends up in the pathlist is the cheapest way of generating each potentially
useful sort ordering of the relation.)  Also create a RelOptInfo.joininfo
list including all the join clauses that involve this relation.  For
example, the WHERE clause "tab1.col1 = tab2.col1" generates entries in
both tab1 and tab2's joininfo lists.
If we have only a single base relation in the query, we are done.
Otherwise we have to figure out how to join the base relations into a
single join relation.
2) Normally, any explicit JOIN clauses are "flattened" so that we just
have a list of relations to join.  However, FULL OUTER JOIN clauses are
never flattened, and other kinds of JOIN might not be either, if the
flattening process is stopped by join_collapse_limit or from_collapse_limit
restrictions.  Therefore, we end up with a planning problem that contains
both lists of relations to be joined in any order, and JOIN nodes that
force a particular join order.  For each un-flattened JOIN node, we join
exactly that pair of relations (after recursively planning their inputs,
if the inputs aren't single base relations).  We generate a Path for each
feasible join method, and select the cheapest Path.  Note that the JOIN
clause structure determines the join Path structure, but it doesn't
constrain the join implementation method at each join (nestloop, merge,
hash), nor does it say which rel is considered outer or inner at each
join.  We consider all these possibilities in building Paths.
3) At the top level of the FROM clause we will have a list of relations
that are either base rels or joinrels constructed per un-flattened JOIN
directives.  (This is also the situation, recursively, when we can flatten
sub-joins underneath an un-flattenable JOIN into a list of relations to
join.)  We can join these rels together in any order the planner sees fit.
The standard (non-GEQO) planner does this as follows:
Consider joining each RelOptInfo to each other RelOptInfo specified in its
RelOptInfo.joininfo, and generate a Path for each possible join method for
each such pair.  (If we have a RelOptInfo with no join clauses, we have no
choice but to generate a clauseless Cartesian-product join; so we consider
joining that rel to each other available rel.  But in the presence of join
clauses we will only consider joins that use available join clauses.)
If we only had two relations in the FROM list, we are done: we just pick
the cheapest path for the join RelOptInfo.  If we had more than two, we now
need to consider ways of joining join RelOptInfos to each other to make
join RelOptInfos that represent more than two FROM items.
The join tree is constructed using a "dynamic programming" algorithm:
in the first pass (already described) we consider ways to create join rels
representing exactly two FROM items.  The second pass considers ways
to make join rels that represent exactly three FROM items; the next pass,
four items, etc.  The last pass considers how to make the final join
relation that includes all FROM items --- obviously there can be only one
join rel at this top level, whereas there can be more than one join rel
at lower levels.  At each level we use joins that follow available join
clauses, if possible, just as described for the first level.
For example:
    SELECT  *
    FROM    tab1, tab2, tab3, tab4
    WHERE   tab1.col = tab2.col AND
        tab2.col = tab3.col AND
        tab3.col = tab4.col
    Tables 1, 2, 3, and 4 are joined as:
    {1 2},{2 3},{3 4}
    {1 2 3},{2 3 4}
    {1 2 3 4}
    (other possibilities will be excluded for lack of join clauses)
    SELECT  *
    FROM    tab1, tab2, tab3, tab4
    WHERE   tab1.col = tab2.col AND
        tab1.col = tab3.col AND
        tab1.col = tab4.col
    Tables 1, 2, 3, and 4 are joined as:
    {1 2},{1 3},{1 4}
    {1 2 3},{1 3 4},{1 2 4}
    {1 2 3 4}
We consider left-handed plans (the outer rel of an upper join is a joinrel,
but the inner is always a single FROM item); right-handed plans (outer rel
is always a single item); and bushy plans (both inner and outer can be
joins themselves).  For example, when building {1 2 3 4} we consider
joining {1 2 3} to {4} (left-handed), {4} to {1 2 3} (right-handed), and
{1 2} to {3 4} (bushy), among other choices.  Although the jointree
scanning code produces these potential join combinations one at a time,
all the ways to produce the same set of joined base rels will share the
same RelOptInfo, so the paths produced from different join combinations
that produce equivalent joinrels will compete in add_path().
Once we have built the final join rel, we use either the cheapest path
for it or the cheapest path with the desired ordering (if that's cheaper
than applying a sort to the cheapest other path).
If the query contains one-sided outer joins (LEFT or RIGHT joins), or
"IN (sub-select)" WHERE clauses that were converted to joins, then some of
the possible join orders may be illegal.  These are excluded by having
make_join_rel consult side lists of outer joins and IN joins to see
whether a proposed join is illegal.  (The same consultation allows it
to see which join style should be applied for a valid join, ie,
JOIN_INNER, JOIN_LEFT, etc.)
Valid OUTER JOIN optimizations
------------------------------
The planner's treatment of outer join reordering is based on the following
identities:
1.	(A leftjoin B on (Pab)) innerjoin C on (Pac)
	= (A innerjoin C on (Pac)) leftjoin B on (Pab)
where Pac is a predicate referencing A and C, etc (in this case, clearly
Pac cannot reference B, or the transformation is nonsensical).
2.	(A leftjoin B on (Pab)) leftjoin C on (Pac)
	= (A leftjoin C on (Pac)) leftjoin B on (Pab)
3.	(A leftjoin B on (Pab)) leftjoin C on (Pbc)
	= A leftjoin (B leftjoin C on (Pbc)) on (Pab)
Identity 3 only holds if predicate Pbc must fail for all-null B rows
(that is, Pbc is strict for at least one column of B).  If Pbc is not
strict, the first form might produce some rows with nonnull C columns
where the second form would make those entries null.
RIGHT JOIN is equivalent to LEFT JOIN after switching the two input
tables, so the same identities work for right joins.  Only FULL JOIN
cannot be re-ordered at all.
An example of a case that does *not* work is moving an innerjoin into or
out of the nullable side of an outer join:
	A leftjoin (B join C on (Pbc)) on (Pab)
	!= (A leftjoin B on (Pab)) join C on (Pbc)
FULL JOIN ordering is enforced by not collapsing FULL JOIN nodes when
translating the jointree to "joinlist" representation.  LEFT and RIGHT
JOIN nodes are normally collapsed so that they participate fully in the
join order search.  To avoid generating illegal join orders, the planner
creates an OuterJoinInfo node for each outer join, and make_join_rel
checks this list to decide if a proposed join is legal.
What we store in OuterJoinInfo nodes are the minimum sets of Relids
required on each side of the join to form the outer join.  Note that
these are minimums; there's no explicit maximum, since joining other
rels to the OJ's syntactic rels may be legal.  Per identities 1 and 2,
non-FULL joins can be freely associated into the lefthand side of an
OJ, but in general they can't be associated into the righthand side.
So the restriction enforced by make_join_rel is that a proposed join
can't join across a RHS boundary (ie, join anything inside the RHS
to anything else) unless the join validly implements some outer join.
(To support use of identity 3, we have to allow cases where an apparent
violation of a lower OJ's RHS is committed while forming an upper OJ.
If this wouldn't in fact be legal, the upper OJ's minimum LHS or RHS
set must be expanded to include the whole of the lower OJ, thereby
preventing it from being formed before the lower OJ is.)
Pulling up subqueries
---------------------
As we described above, a subquery appearing in the range table is planned
independently and treated as a "black box" during planning of the outer
query.  This is necessary when the subquery uses features such as
aggregates, GROUP, or DISTINCT.  But if the subquery is just a simple
scan or join, treating the subquery as a black box may produce a poor plan
compared to considering it as part of the entire plan search space.
Therefore, at the start of the planning process the planner looks for
simple subqueries and pulls them up into the main query's jointree.
Pulling up a subquery may result in FROM-list joins appearing below the top
of the join tree.  Each FROM-list is planned using the dynamic-programming
search method described above.
If pulling up a subquery produces a FROM-list as a direct child of another
FROM-list, then we can merge the two FROM-lists together.  Once that's
done, the subquery is an absolutely integral part of the outer query and
will not constrain the join tree search space at all.  However, that could
result in unpleasant growth of planning time, since the dynamic-programming
search has runtime exponential in the number of FROM-items considered.
Therefore, we don't merge FROM-lists if the result would have too many
FROM-items in one list.
Optimizer Functions
-------------------
The primary entry point is planner().
planner()
 set up for recursive handling of subqueries
 do final cleanup after planning.
-subquery_planner()
 pull up subqueries from rangetable, if possible
 canonicalize qual
     Attempt to simplify WHERE clause to the most useful form; this includes
     flattening nested AND/ORs and detecting clauses that are duplicated in
     different branches of an OR.
 simplify constant expressions
 process sublinks
 convert Vars of outer query levels into Params
--grouping_planner()
  preprocess target list for non-SELECT queries
  handle UNION/INTERSECT/EXCEPT, GROUP BY, HAVING, aggregates,
	ORDER BY, DISTINCT, LIMIT
--query_planner()
   pull out constant quals, which can be used to gate execution of the
	whole plan (if any are found, we make a top-level Result node
	to do the gating)
   make list of base relations used in query
   split up the qual into restrictions (a=1) and joins (b=c)
   find qual clauses that enable merge and hash joins
----make_one_rel()
     set_base_rel_pathlist()
      find scan and all index paths for each base relation
      find selectivity of columns used in joins
-----make_one_rel_by_joins()
      jump to geqo if needed
      else call make_rels_by_joins() for each level of join tree needed
      make_rels_by_joins():
        For each joinrel of the prior level, do make_rels_by_clause_joins()
        if it has join clauses, or make_rels_by_clauseless_joins() if not.
        Also generate "bushy plan" joins between joinrels of lower levels.
      Back at make_one_rel_by_joins(), apply set_cheapest() to extract the
      cheapest path for each newly constructed joinrel.
      Loop back if this wasn't the top join level.
   Back at query_planner:
    put back any constant quals by adding a Result node
 Back at grouping_planner:
 do grouping(GROUP)
 do aggregates
 make unique(DISTINCT)
 make sort(ORDER BY)
 make limit(LIMIT/OFFSET)
Optimizer Data Structures
-------------------------
PlannerInfo     - global information for planning a particular Query
RelOptInfo      - a relation or joined relations
 RestrictInfo   - WHERE clauses, like "x = 3" or "y = z"
                  (note the same structure is used for restriction and
                   join clauses)
 Path           - every way to generate a RelOptInfo(sequential,index,joins)
  SeqScan       - a plain Path node with pathtype = T_SeqScan
  IndexPath     - index scans
  BitmapHeapPath - top of a bitmapped index scan
  TidPath       - scan by CTID
  AppendPath    - append multiple subpaths together
  ResultPath    - a Result plan node (used for FROM-less SELECT)
  MaterialPath  - a Material plan node
  UniquePath    - remove duplicate rows
  NestPath      - nested-loop joins
  MergePath     - merge joins
  HashPath      - hash joins
 PathKeys       - a data structure representing the ordering of a path
The optimizer spends a good deal of its time worrying about the ordering
of the tuples returned by a path.  The reason this is useful is that by
knowing the sort ordering of a path, we may be able to use that path as
the left or right input of a mergejoin and avoid an explicit sort step.
Nestloops and hash joins don't really care what the order of their inputs
is, but mergejoin needs suitably ordered inputs.  Therefore, all paths
generated during the optimization process are marked with their sort order
(to the extent that it is known) for possible use by a higher-level merge.
It is also possible to avoid an explicit sort step to implement a user's
ORDER BY clause if the final path has the right ordering already, so the
sort ordering is of interest even at the top level.  query_planner() will
look for the cheapest path with a sort order matching the desired order,
and grouping_planner() will compare its cost to the cost of using the
cheapest-overall path and doing an explicit sort.
When we are generating paths for a particular RelOptInfo, we discard a path
if it is more expensive than another known path that has the same or better
sort order.  We will never discard a path that is the only known way to
achieve a given sort order (without an explicit sort, that is).  In this
way, the next level up will have the maximum freedom to build mergejoins
without sorting, since it can pick from any of the paths retained for its
inputs.
PathKeys
--------
The PathKeys data structure represents what is known about the sort order
of a particular Path.
Path.pathkeys is a List of Lists of PathKeyItem nodes that represent
the sort order of the result generated by the Path.  The n'th sublist
represents the n'th sort key of the result.
In single/base relation RelOptInfo's, the Paths represent various ways
of scanning the relation and the resulting ordering of the tuples.
Sequential scan Paths have NIL pathkeys, indicating no known ordering.
Index scans have Path.pathkeys that represent the chosen index's ordering,
if any.  A single-key index would create a pathkey with a single sublist,
e.g. ( (tab1.indexkey1/sortop1) ).  A multi-key index generates a sublist
per key, e.g. ( (tab1.indexkey1/sortop1) (tab1.indexkey2/sortop2) ) which
shows major sort by indexkey1 (ordering by sortop1) and minor sort by
indexkey2 with sortop2.
Note that a multi-pass indexscan (OR clause scan) has NIL pathkeys since
we can say nothing about the overall order of its result.  Also, an
indexscan on an unordered type of index generates NIL pathkeys.  However,
we can always create a pathkey by doing an explicit sort.  The pathkeys
for a Sort plan's output just represent the sort key fields and the
ordering operators used.
Things get more interesting when we consider joins.  Suppose we do a
mergejoin between A and B using the mergeclause A.X = B.Y.  The output
of the mergejoin is sorted by X --- but it is also sorted by Y.  We
represent this fact by listing both keys in a single pathkey sublist:
( (A.X/xsortop B.Y/ysortop) ).  This pathkey asserts that the major
sort order of the Path can be taken to be *either* A.X or B.Y.
They are equal, so they are both primary sort keys.  By doing this,
we allow future joins to use either var as a pre-sorted key, so upper
Mergejoins may be able to avoid having to re-sort the Path.  This is
why pathkeys is a List of Lists.
We keep a sortop associated with each PathKeyItem because cross-data-type
mergejoins are possible; for example int4 = int8 is mergejoinable.
In this case we need to remember that the left var is ordered by int4lt
while the right var is ordered by int8lt.  So the different members of
each sublist could have different sortops.
Note that while the order of the top list is meaningful (primary vs.
secondary sort key), the order of each sublist is arbitrary.  Each sublist
should be regarded as a set of equivalent keys, with no significance
to the list order.
With a little further thought, it becomes apparent that pathkeys for
joins need not only come from mergejoins.  For example, if we do a
nestloop join between outer relation A and inner relation B, then any
pathkeys relevant to A are still valid for the join result: we have
not altered the order of the tuples from A.  Even more interesting,
if there was a mergeclause (more formally, an "equijoin clause") A.X=B.Y,
and A.X was a pathkey for the outer relation A, then we can assert that
B.Y is a pathkey for the join result; X was ordered before and still is,
and the joined values of Y are equal to the joined values of X, so Y
must now be ordered too.  This is true even though we used neither an
explicit sort nor a mergejoin on Y.
More generally, whenever we have an equijoin clause A.X = B.Y and a
pathkey A.X, we can add B.Y to that pathkey if B is part of the joined
relation the pathkey is for, *no matter how we formed the join*.  It works
as long as the clause has been applied at some point while forming the
join relation.  (In the current implementation, we always apply qual
clauses as soon as possible, ie, as far down in the plan tree as possible.
So we can treat the pathkeys as equivalent everywhere.  The exception is
when the relations A and B are joined inside the nullable side of an
OUTER JOIN and the equijoin clause comes from above the OUTER JOIN.  In this
case we cannot apply the qual as soon as A and B are joined, so we do not
consider the pathkeys to be equivalent.  This could be improved if we wanted
to go to the trouble of making pathkey equivalence be context-dependent,
but that seems much more complex than it's worth.)
In short, then: when producing the pathkeys for a merge or nestloop join,
we can keep all of the keys of the outer path, since the ordering of the
outer path will be preserved in the result.  Furthermore, we can add to
each pathkey sublist any inner vars that are equijoined to any of the
outer vars in the sublist; this works regardless of whether we are
implementing the join using that equijoin clause as a mergeclause,
or merely enforcing the clause after-the-fact as a qpqual filter.
Although Hashjoins also work only with equijoin operators, it is *not*
safe to consider the output of a Hashjoin to be sorted in any particular
order --- not even the outer path's order.  This is true because the
executor might have to split the join into multiple batches.  Therefore
a Hashjoin is always given NIL pathkeys.  (Also, we need to use only
mergejoinable operators when deducing which inner vars are now sorted,
because a mergejoin operator tells us which left- and right-datatype
sortops can be considered equivalent, whereas a hashjoin operator
doesn't imply anything about sort order.)
Pathkeys are also useful to represent an ordering that we wish to achieve,
since they are easily compared to the pathkeys of a potential candidate
path.  So, SortClause lists are turned into pathkeys lists for use inside
the optimizer.
OK, now for how it *really* works:
We did implement pathkeys just as described above, and found that the
planner spent a huge amount of time comparing pathkeys, because the
representation of pathkeys as unordered lists made it expensive to decide
whether two were equal or not.  So, we've modified the representation
as described next.
If we scan the WHERE clause for equijoin clauses (mergejoinable clauses)
during planner startup, we can construct lists of equivalent pathkey items
for the query.  There could be more than two items per equivalence set;
for example, WHERE A.X = B.Y AND B.Y = C.Z AND D.R = E.S creates the
equivalence sets { A.X B.Y C.Z } and { D.R E.S } (plus associated sortops).
Any pathkey item that belongs to an equivalence set implies that all the
other items in its set apply to the relation too, or at least all the ones
that are for fields present in the relation.  (Some of the items in the
set might be for as-yet-unjoined relations.)  Furthermore, any multi-item
pathkey sublist that appears at any stage of planning the query *must* be
a subset of one or another of these equivalence sets; there's no way we'd
have put two items in the same pathkey sublist unless they were equijoined
in WHERE.
Now suppose that we allow a pathkey sublist to contain pathkey items for
vars that are not yet part of the pathkey's relation.  This introduces
no logical difficulty, because such items can easily be seen to be
irrelevant; we just mandate that they be ignored.  But having allowed
this, we can declare (by fiat) that any multiple-item pathkey sublist
must be "equal()" to the appropriate equivalence set.  In effect,
whenever we make a pathkey sublist that mentions any var appearing in an
equivalence set, we instantly add all the other vars equivalenced to it,
whether they appear yet in the pathkey's relation or not.  And we also
mandate that the pathkey sublist appear in the same order as the
equivalence set it comes from.
In fact, we can go even further, and say that the canonical representation
of a pathkey sublist is a pointer directly to the relevant equivalence set,
which is kept in a list of pathkey equivalence sets for the query.  Then
pathkey sublist comparison reduces to pointer-equality checking!  To do this
we also have to add single-element pathkey sublists to the query's list of
equivalence sets, but that's a small price to pay.
By the way, it's OK and even useful for us to build equivalence sets
that mention multiple vars from the same relation.  For example, if
we have WHERE A.X = A.Y and we are scanning A using an index on X,
we can legitimately conclude that the path is sorted by Y as well;
and this could be handy if Y is the variable used in other join clauses
or ORDER BY.  So, any WHERE clause with a mergejoinable operator can
contribute to an equivalence set, even if it's not a join clause.
As sketched so far, equijoin operators allow us to conclude that
A.X = B.Y and B.Y = C.Z together imply A.X = C.Z, even when different
datatypes are involved.  What is not immediately obvious is that to use
the "canonical pathkey" representation, we *must* make this deduction.
An example (from a real bug in Postgres 7.0) is a mergejoin for a query
like
	SELECT * FROM t1, t2 WHERE t1.f2 = t2.f3 AND t1.f1 = t2.f3;
The canonical-pathkey mechanism is able to deduce that t1.f1 = t1.f2
(ie, both appear in the same canonical pathkey set).  If we sort t1
and then apply a mergejoin, we *must* filter the t1 tuples using the
implied qualification f1 = f2, because otherwise the output of the sort
will be ordered by f1 or f2 (whichever we sort on) but not both.  The
merge will then fail since (depending on which qual clause it applies
first) it's expecting either ORDER BY f1,f2 or ORDER BY f2,f1, but the
actual output of the sort has neither of these orderings.  The best fix
for this is to generate all the implied equality constraints for each
equijoin set and add these clauses to the query's qualification list.
In other words, we *explicitly* deduce f1 = f2 and add this to the WHERE
clause.  The constraint will be applied as a qpqual to the output of the
scan on t1, resulting in sort output that is indeed ordered by both vars.
This approach provides more information to the selectivity estimation
code than it would otherwise have, and reduces the number of tuples
processed in join stages, so it's a win to make these deductions even
if we weren't forced to.
When we generate implied equality constraints, we may find ourselves
adding redundant clauses to specific relations.  For example, consider
	SELECT * FROM t1, t2, t3 WHERE t1.a = t2.b AND t2.b = t3.c;
We will generate the implied clause t1.a = t3.c and add it to the tree.
This is good since it allows us to consider joining t1 and t3 directly,
which we otherwise wouldn't do.  But when we reach the stage of joining
all three relations, we will have redundant join clauses --- eg, if we
join t1 and t2 first, then the path that joins (t1 t2) to t3 will have
both t2.b = t3.c and t1.a = t3.c as restriction clauses.  This is bad;
not only is evaluation of the extra clause useless work at runtime,
but the selectivity estimator routines will underestimate the number
of tuples produced since they won't know that the two clauses are
perfectly redundant.  We fix this by detecting and removing redundant
clauses as the restriction clause list is built for each join.  (We
can't do it sooner, since which clauses are redundant will vary depending
on the join order.)
Yet another implication of all this is that mergejoinable operators
must form closed equivalence sets.  For example, if "int2 = int4"
and "int4 = int8" are both marked mergejoinable, then there had better
be a mergejoinable "int2 = int8" operator as well.  Otherwise, when
we're given WHERE int2var = int4var AND int4var = int8var, we'll fail
while trying to create a representation of the implied clause
int2var = int8var.
An additional refinement we can make is to insist that canonical pathkey
lists (sort orderings) do not mention the same pathkey set more than once.
For example, a pathkey list ((A) (B) (A)) is redundant --- the second
occurrence of (A) does not change the ordering, since the data must already
be sorted by A.  Although a user probably wouldn't write ORDER BY A,B,A
directly, such redundancies are more probable once equijoin equivalences
have been considered.  Also, the system is likely to generate redundant
pathkey lists when computing the sort ordering needed for a mergejoin.  By
eliminating the redundancy, we save time and improve planning, since the
planner will more easily recognize equivalent orderings as being equivalent.
Though Bob Devine <bob.devine@worldnet.att.net> was not involved in the 
coding of our optimizer, he is available to field questions about
optimizer topics.
-- bjm & tgl
 |