Re: Index is not used for "IN (non-correlated subquery)"

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: George <pinkisntwell(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Index is not used for "IN (non-correlated subquery)"
Date: 2016-12-01 16:58:25
Message-ID: 24320.1480611505@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

George <pinkisntwell(at)gmail(dot)com> writes:
> explain analyze select *
> from wg3ppbm_transaction where partner_uuid in (
> select p.uuid
> from wg3ppbm_userpartner up
> join wg3ppbm_partner p on p.id = up.partner_id
> );

> "Hash Semi Join (cost=2.07..65628.14 rows=663727 width=380) (actual
> time=0.346..1542.730 rows=1 loops=1)"
> " Hash Cond: ((wg3ppbm_transaction.partner_uuid)::text = (p.uuid)::text)"
> " -> Seq Scan on wg3ppbm_transaction (cost=0.00..54757.54
> rows=1327454 width=380) (actual time=0.004..878.568 rows=1327587
> loops=1)"

So you're still getting the 50% default estimate, which is why it doesn't
want to use the index ...

> " -> Hash (cost=2.06..2.06 rows=1 width=37) (actual
> time=0.017..0.017 rows=1 loops=1)"
> " Buckets: 1024 Batches: 1 Memory Usage: 5kB"
> " -> Nested Loop (cost=0.00..2.06 rows=1 width=37) (actual
> time=0.011..0.012 rows=1 loops=1)"
> " Join Filter: (up.partner_id = p.id)"
> " Rows Removed by Join Filter: 1"
> " -> Seq Scan on wg3ppbm_userpartner up
> (cost=0.00..1.01 rows=1 width=4) (actual time=0.004..0.004 rows=1
> loops=1)"
> " -> Seq Scan on wg3ppbm_partner p (cost=0.00..1.02
> rows=2 width=41) (actual time=0.001..0.001 rows=2 loops=1)"

... and you still don't have any meaningful number of rows in
wg3ppbm_userpartner or wg3ppbm_partner. However, I don't understand how
it knows that there's only one or two rows in those tables and yet is
producing the stupid default estimate for the semijoin. I spent some time
trying to duplicate that behavior, without success. What PG version is
that, exactly? Have you vacuumed and/or analyzed those two tables? What
do you get for

select * from pg_stats where tablename = 'wg3ppbm_userpartner';

and likewise for wg3ppbm_partner?

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2016-12-01 17:01:54 Re: PostgreSQL ODBC driver for OSX 10.8
Previous Message Joshua D. Drake 2016-12-01 16:56:57 Re: Overwrite pg_catalog?