From: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | David Grelaud <dgrelaud(at)ideolys(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Why PG uses nested-loop join when no indexes are available? |
Date: | 2016-01-15 01:16:10 |
Message-ID: | CAKJS1f-erAogxxd=4tr2rLukoy5M-Or+djyZr2rbmAnZ35T+eg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 15 January 2016 at 04:00, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
> > Perhaps separating out enable_nestloop so that it only disables
> > non-parameterised nested loops, and add another GUC for parameterised
> > nested loops would be a good thing to do. Likely setting enable_nestloop
> to
> > off in production would be a slightly easier decision to make, if that
> was
> > the case.
> > It looks pretty simple to do this, so I hacked it up, and attached it
> here.
> > There's no doc changes and I'm not that interested in fighting for this
> > change, it's more just an idea for consideration.
>
> I'm not terribly excited by this idea either. If making such a change
> actually makes things better for someone consistently, I'd argue that
> the problem is a mistaken cost estimate elsewhere, and we'd be better off
> to find and fix the real problem. (There have already been discussions
> of only believing single-row rowcount estimates when they're provably
> true, which might help if we can figure out how to do it cheaply enough.)
>
Actually, it's not very hard to hit a bad underestimate at all. All you
need is a join on two columns which are co-related. Since PostgreSQL
multiplies the estimated selectivities the row count is going to come out
too low. This also tricks the planner into thinking that this is a good
join to perform early, since (it thinks that) it does not produce many rows
at all. You only need 1 more join to occur after that to choose a nested
loop join mistakenly to hit the issue.
FWIW TPC-H Q9 has this exact trip hazard with the partsupp table, which is
the exact reason why this patch was born:
https://commitfest.postgresql.org/7/210/
I also think that the attitude that we can *always* fix the costs and
estimates is not the right one. The planner is never going to get it right
100% of the time. If we ever think we can build such a planner then someone
needs to come along and direct us back into the real world.
--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | 閬閬イふ | 2016-01-15 02:51:11 | about test_parser installation failure problem(PostgreSQL in 9.5.0)? |
Previous Message | Berend Tober | 2016-01-15 00:57:21 | Re: WIP: CoC V5, etc., etc., etc., etc., .... |