Re: nested loop semijoin estimates

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: nested loop semijoin estimates
Date: 2015-05-31 23:34:59
Message-ID: 556B9AA3.2020001@2ndquadrant.com
Lists: pgsql-hackers

On 06/01/15 00:08, Tom Lane wrote:
> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>> On 05/30/15 23:16, Tom Lane wrote:
>>> Attached is a draft patch for that. It fixes the problem for me:
>
>> Seems to be working OK, but I still do get a Bitmap Heap Scan there (but
>> more about that later).
>
> Attached is an incremental patch (on top of the previous one) to
> allow startup cost of parameterized paths to be considered when the
> relation is the RHS of a semi or anti join. It seems reasonably clean
> except for one thing: logically, we perhaps should remove the checks
> on path->param_info from the last half of
> compare_path_costs_fuzzily(), so as to increase the symmetry between
> parameterized paths and unparameterized ones. However, when I did
> that, I got changes in some of the regression test plans, and they
> didn't seem to be for the better. So I left that alone. As-is, this
> patch doesn't seem to affect the results for any existing regression
> tests.

Seems to be working fine. I've tried a bunch of queries modifying the
test case in various ways, and all of them get planned sensibly - I've
been unable to come up with a query that gets planned badly.

Regarding the remaining checks in compare_path_costs_fuzzily(), aren't
the regression plan changes you saw simply caused by the very small data
sets? For example, the first "failing" plan in join.sql looks like this:

Nested Loop Left Join  (cost=0.29..22.80 rows=2 width=12)
  Output: "*VALUES*".column1, i1.f1, (666)
  Join Filter: ("*VALUES*".column1 = i1.f1)
  ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=4)
        Output: "*VALUES*".column1
  ->  Materialize  (cost=0.29..22.64 rows=5 width=8)
        Output: i1.f1, (666)
        ->  Nested Loop Left Join  (cost=0.29..22.61 rows=5 width=8)
              Output: i1.f1, 666
              ->  Seq Scan on public.int4_tbl i1  (cost=0.00..1.05 ...
                    Output: i1.f1
              ->  Index Only Scan using tenk1_unique2 on public.tenk1 ...
                    Output: i2.unique2
                    Index Cond: (i2.unique2 = i1.f1)

while with the changes it'd look like this:

Hash Right Join  (cost=0.34..22.70 rows=2 width=12)
  Output: "*VALUES*".column1, i1.f1, (666)
  Hash Cond: (i1.f1 = "*VALUES*".column1)
  ->  Nested Loop Left Join  (cost=0.29..22.61 rows=5 width=8)
        Output: i1.f1, 666
        ->  Seq Scan on public.int4_tbl i1  (cost=0.00..1.05 ...
              Output: i1.f1
        ->  Index Only Scan using tenk1_unique2 on public.tenk1 ...
              Output: i2.unique2
              Index Cond: (i2.unique2 = i1.f1)
  ->  Hash  (cost=0.03..0.03 rows=2 width=4)
        Output: "*VALUES*".column1
        ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 ...
              Output: "*VALUES*".column1
(14 rows)
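
For reference, the query shape implied by the two plans is roughly this
(my reconstruction from the plan output, not necessarily the exact
join.sql test case):

-- a sketch of the query shape implied by the plans above; the actual
-- join.sql test may differ in details
EXPLAIN (VERBOSE)
SELECT v.column1, sub.f1, sub.x
FROM (VALUES (1), (2)) v
LEFT JOIN (SELECT i1.f1, 666 AS x
             FROM int4_tbl i1
             LEFT JOIN tenk1 i2 ON i2.unique2 = i1.f1) sub
       ON v.column1 = sub.f1;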

So the planner actually believes the hash join plan to be cheaper,
although only by a tiny margin (22.70 vs. 22.80). And I see pretty much
no difference in planning/execution time (but I'm on a machine with
power management and VMs, so there's a lot of random noise).

But once int4_tbl gets bigger (say, 160k rows instead of the 5), the
hash join clearly wins even with the current code. Actually, the switch
away from the current plan happens much sooner - at 500 rows the planner
already picks the hash join, and I see about the same timings.
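
To reproduce that crossover, I simply inflated the table, roughly like
this (a sketch - the exact row counts I tried varied between runs):

-- inflate int4_tbl, then re-run EXPLAIN ANALYZE on the query above
INSERT INTO int4_tbl SELECT i FROM generate_series(1, 160000) i;
ANALYZE int4_tbl;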

I don't really see why you consider those plan changes bad. And even if
they are, isn't that simply a matter of tuning the cost parameters?

>> Do you plan to push that into 9.5, or 9.6? I assume it's a
>> behavior change so that no back-patching, right?
>
> Mumble. It's definitely a planner bug fix, and I believe that the
> effects are narrowly constrained to cases where significantly better
> plans are possible. So personally I'd be willing to back-patch it.
> But others might see that differently, especially since it's been
> like this for a long time and we only have one field complaint.

+1 to back-patching from me. It's true we only have one field complaint,
but I believe more users are impacted by this - they just didn't notice.
We only really got this one complaint because of the plan discrepancy
between two queries that are almost exactly the same.

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
