Re: New design for FK-based join selectivity estimation

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New design for FK-based join selectivity estimation
Date: 2016-06-13 17:28:36
Message-ID: CANP8+jLsiASHZ=5at_uw7wn7Ujkd6QVJeSf=w1qaoSOSKC3V4g@mail.gmail.com
Lists: pgsql-hackers

On 4 June 2016 at 20:44, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> This is a branch of the discussion in
>
> https://www.postgresql.org/message-id/flat/20160429102531.GA13701%40huehner.biz
> but I'm starting a new thread as the original title is getting
> increasingly off-topic.
>
> I complained in that thread that the FK join selectivity patch had a
> very brute-force approach to matching join qual clauses to FK
> constraints, requiring a total of seven nested levels of looping to
> get anything done, and expensively rediscovering the same facts over
> and over. Here is a sketch of what I think is a better way:
>

Thanks for your review and design notes here, which look like good
improvements.

Tomas has been discussing this with me and others, but I just realised
that might not be apparent on list, so just to mention: there is activity on
this, and new code will be published very soon.

On the above-mentioned thread, Tomas' analysis was this:
https://www.postgresql.org/message-id/8344835e-18af-9d40-aed7-bd2261be9162%402ndquadrant.com
> There are probably a few reasonably simple things we could do - e.g.
> ignore foreign keys with just a single column, as the primary goal of
> the patch is improving estimates with multi-column foreign keys. I
> believe that covers a vast majority of foreign keys in the wild.

I agree with that comment. The relcache code retrieves all FKs, even ones
that have a single column. Yet the planner code never uses them unless
nKeys>1. That was masked somewhat by my two commits, treating the info as
generic and then using only a very specific subset of it.

So a simple change is to make RelationGetFKeyList() only retrieve FKs with
nKeys>1. Rename to RelationGetMultiColumnFKeyList(). That greatly reduces
the scope for increased planning time.
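
As a rough illustration of that filter (this is not the actual relcache
code; the FKeyInfo struct and filter_multicolumn_fkeys() below are
hypothetical, simplified stand-ins), the idea is just to drop single-column
FKs before the planner ever sees the list:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cached foreign-key entry. */
typedef struct FKeyInfo
{
    int conrelid;   /* referencing relation (illustrative id only) */
    int confrelid;  /* referenced relation (illustrative id only) */
    int nkeys;      /* number of columns in the foreign key */
} FKeyInfo;

/*
 * Copy only the multi-column foreign keys (nkeys > 1) from "all" into "out".
 * "out" must have room for at least "nall" entries.
 * Returns the number of entries written to "out".
 */
size_t
filter_multicolumn_fkeys(const FKeyInfo *all, size_t nall, FKeyInfo *out)
{
    size_t nout = 0;

    assert(all != NULL || nall == 0);
    assert(out != NULL || nall == 0);

    for (size_t i = 0; i < nall; i++)
    {
        /* Single-column FKs are skipped: the planner code ignores them. */
        if (all[i].nkeys > 1)
            out[nout++] = all[i];
    }
    return nout;
}
```

With the filtering done once, when the list is built, later matching of
join clauses against FKs only has to loop over the entries that can
actually be used.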

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
