Re: PG choosing nested loop for set membership?

From: Brian Crowell <brian(at)fluggo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-general(at)postgresql(dot)org >> PG-General Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject: Re: PG choosing nested loop for set membership?
Date: 2014-03-26 15:01:33
Message-ID: CAAQkdDot3AoyFC1Jyrwkat2Jm=sbdEaaKPkVxhDdqMj7+-gC8w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, Mar 25, 2014 at 5:59 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Yeah. The weird thing about that is that the nestloop rowcount estimate
> isn't the product of the two input rowcounts --- you'd sort of expect an
> estimate of 158 given the input-relation sizes. While that's not ipso
> facto evidence of a bug (because the estimates are arrived at in different
> ways), I'm having a hard time replicating it here. Are you using an
> up-to-date PG release?

All right, I think I'm onto something. But first I'll answer your questions.

Version is 9.3.3 from the Postgres Debian archives.

> One thing that might help is to increase the statistics target for
> pl2._visible_accounts_by_rule_set. The other two tables are small enough
> that you don't need to do that for them. (Although come to think of it,
> they are also small enough that maybe auto-analyze isn't triggering for
> them ... does a manual ANALYZE improve matters?)

You were right that auto-analyze didn't go after them. Weird. But a
few manual analyzes later, and no change.

Here's what I did, though. I collapsed the pl2.current_user view into
pl2.visible_accounts:

===
select
acc.account,
acc.manager,
acc.is_fund
from pl2._visible_accounts_by_rule_set acc
inner join (pl2._users u
left join pl2._users iu on u.impersonating = iu.user_id)
on acc.rule_set_id = coalesce(iu.permission_rule_set_id,
u.permission_rule_set_id)
where u.user_principal_name = session_user
===

I noticed that join-on-coalesce pattern that gave us trouble in SQL
Server. The query planner can't do a thing with that. So I rewrote the
query so the last join would be solid:

===
select
acc.account,
acc.manager,
acc.is_fund
from pl2._users lu
inner join pl2._users u on u.user_id = coalesce(lu.impersonating,
lu.user_id)
inner join pl2._visible_accounts_by_rule_set acc
on acc.rule_set_id = u.permission_rule_set_id
where lu.user_principal_name = session_user
===

The join order is the same, and the indexes used are the same, but the
estimate is much better:

'Nested Loop (cost=0.68..13.70 rows=133 width=10) (actual
time=0.073..0.211 rows=241 loops=1)'
' Output: acc.account, acc.manager, acc.is_fund'
' Buffers: shared hit=10'
' -> Nested Loop (cost=0.54..8.58 rows=1 width=4) (actual
time=0.056..0.059 rows=1 loops=1)'
' Output: u.permission_rule_set_id'
' Buffers: shared hit=7'
' -> Index Scan using _pl2_users_user_principal_name_idx on
pl2._users lu (cost=0.27..4.29 rows=1 width=8) (actual
time=0.045..0.047 rows=1 loops=1)'
' Output: lu.user_id, lu.user_principal_name, lu.name,
lu.permission_rule_set_id, lu.impersonating, lu.is_admin'
' Index Cond: (lu.user_principal_name = ("session_user"())::text)'
' Buffers: shared hit=4'
' -> Index Scan using _users_pkey on pl2._users u
(cost=0.27..4.29 rows=1 width=8) (actual time=0.006..0.006 rows=1
loops=1)'
' Output: u.user_id, u.user_principal_name, u.name,
u.permission_rule_set_id, u.impersonating, u.is_admin'
' Index Cond: (u.user_id = COALESCE(lu.impersonating, lu.user_id))'
' Buffers: shared hit=3'
' -> Index Scan using _visible_accounts_by_rule_set_idx on
pl2._visible_accounts_by_rule_set acc (cost=0.15..3.54 rows=158
width=14) (actual time=0.015..0.089 rows=241 loops=1)'
' Output: acc.rule_set_id, acc.account, acc.manager, acc.is_fund'
' Index Cond: (acc.rule_set_id = u.permission_rule_set_id)'
' Buffers: shared hit=3'
'Total runtime: 0.297 ms'

I'll see if I can write an isolated test case for the coalesce
misestimate. Or do you think the query planner will ever be able to do
anything with that form?

--Brian

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2014-03-26 15:23:57 Re: PG choosing nested loop for set membership?
Previous Message Adrian Klaver 2014-03-26 14:05:10 Re: Failure upgrading PG 9.2 to 9.3