diff options
| author | Dean Rasheed <dean.a.rasheed@gmail.com> | 2020-03-28 12:48:34 +0000 | 
|---|---|---|
| committer | Dean Rasheed <dean.a.rasheed@gmail.com> | 2020-03-28 12:48:34 +0000 | 
| commit | 87779aa47463d0fb3b4743a7d5c9534994bf7c98 (patch) | |
| tree | b583f175daf89ace14b4a523936d93d647509ce5 /src/backend/utils/error | |
| parent | 145cb16d3bd7d20dbf14507905404aa8153ebf2a (diff) | |
Prevent functional dependency estimates from exceeding column estimates.
Formerly we applied a functional dependency "a => b with dependency
degree f" using the formula
  P(a,b) = P(a) * [f + (1-f)*P(b)]
This leads to the possibility that the combined selectivity P(a,b)
could exceed P(b), which is not ideal. The addition of support for IN
and OR clauses (commits 8f321bd16c and ccaa3569f5) would seem to make
this more likely, since the user-supplied values in such clauses are
not necessarily compatible with the functional dependency.
Mitigate this by using the formula
  P(a,b) = f * Min(P(a), P(b)) + (1-f) * P(a) * P(b)
instead, which guarantees that the combined selectivity is less than
each column's individual selectivity. Logically, this is modifies the
part of the formula that accounts for dependent rows to handle cases
where P(a) > P(b), whilst not changing the second term which accounts
for independent rows.
Additionally, this refactors the way that functional dependencies are
applied, so now dependencies_clauselist_selectivity() estimates both
the implying clauses and the implied clauses for each functional
dependency (formerly only the implied clauses were estimated), and now
all clauses for each attribute are taken into account (formerly only
one clause for each implied attribute was estimated). This removes the
previously built-in assumption that only equality clauses will be
seen, which is no longer true, and opens up the possibility of
applying functional dependencies to more general clauses.
Patch by me, reviewed by Tomas Vondra.
Discussion: https://postgr.es/m/CAEZATCXaNFZyOhR4XXAfkvj1tibRBEjje6ZbXwqWUB_tqbH%3Drw%40mail.gmail.com
Discussion: https://postgr.es/m/20200318002946.6dvblukm3cfmgir2%40development
Diffstat (limited to 'src/backend/utils/error')
0 files changed, 0 insertions, 0 deletions
