Re: Index is not used for "IN (non-correlated subquery)"

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: George <pinkisntwell(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Index is not used for "IN (non-correlated subquery)"
Date: 2016-11-30 16:42:32
Message-ID: 20748.1480524152@sss.pgh.pa.us
Lists: pgsql-general

George <pinkisntwell(at)gmail(dot)com> writes:
> explain select * from wg3ppbm_transaction where partner_uuid in (
> select p.uuid
> from wg3ppbm_userpartner up
> join wg3ppbm_partner p on p.id = up.partner_id
> );

> "Hash Semi Join  (cost=2.07..425.72 rows=2960 width=482)"
> "  Hash Cond: ((wg3ppbm_transaction.partner_uuid)::text = (p.uuid)::text)"
> "  ->  Seq Scan on wg3ppbm_transaction  (cost=0.00..375.19 rows=5919 width=482)"
> "  ->  Hash  (cost=2.06..2.06 rows=1 width=37)"
> "        ->  Nested Loop  (cost=0.00..2.06 rows=1 width=37)"
> "              Join Filter: (up.partner_id = p.id)"
> "              ->  Seq Scan on wg3ppbm_userpartner up  (cost=0.00..1.01 rows=1 width=4)"
> "              ->  Seq Scan on wg3ppbm_partner p  (cost=0.00..1.02 rows=2 width=41)"

This plan is expecting to have to return about half of the rows in
wg3ppbm_transaction (an estimated 2960 out of 5919), a situation for
which an indexscan would NOT be a better choice. The usual rule of
thumb is that you need to be retrieving at most one or two percent of
a table's rows for an indexscan on it to be faster than a seqscan.
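
One way to check this against your own data is to ask the planner what
it thinks an index-based plan would cost and compare that with the
seqscan plan above. This is only a sketch: it assumes an index already
exists on wg3ppbm_transaction.partner_uuid (its definition is not shown
in this thread), and it uses the enable_seqscan planner setting, which
discourages sequential scans rather than forbidding them.

```sql
-- Sketch only: assumes an index on wg3ppbm_transaction(partner_uuid) exists.
BEGIN;

-- Penalize seqscans for this transaction only, so the planner shows
-- the best plan it can find that avoids them where possible.
SET LOCAL enable_seqscan = off;

EXPLAIN
SELECT *
FROM wg3ppbm_transaction
WHERE partner_uuid IN (
    SELECT p.uuid
    FROM wg3ppbm_userpartner up
    JOIN wg3ppbm_partner p ON p.id = up.partner_id
);

-- Nothing was modified; ROLLBACK just discards the SET LOCAL.
ROLLBACK;
```

If the total cost reported here is higher than the 425.72 of the Hash
Semi Join, the planner's choice of a seqscan is consistent with its own
estimates. Tables with no usable index will still show Seq Scan nodes,
just with a very large cost added.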

I think, however, that the "half" may be a default estimate occasioned
by the other tables being empty and therefore not having any statistics.
Another rule of thumb is that the plans you get for tiny tables have
little to do with what happens once there's lots of data.
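
To see whether missing statistics are behind the estimate, one could
load representative data, refresh the statistics, and look at the plan
again. A minimal sketch using the table names from the query above; the
pg_class lookup is only there to show what the planner currently
believes about each table's size.

```sql
-- Refresh planner statistics for the three tables in the query.
ANALYZE wg3ppbm_userpartner;
ANALYZE wg3ppbm_partner;
ANALYZE wg3ppbm_transaction;

-- Inspect the row and page counts the planner will work from.
SELECT relname, reltuples, relpages
FROM pg_class
WHERE relname IN ('wg3ppbm_userpartner',
                  'wg3ppbm_partner',
                  'wg3ppbm_transaction');

-- Re-run the original EXPLAIN and compare the estimated row counts.
EXPLAIN
SELECT *
FROM wg3ppbm_transaction
WHERE partner_uuid IN (
    SELECT p.uuid
    FROM wg3ppbm_userpartner up
    JOIN wg3ppbm_partner p ON p.id = up.partner_id
);
```

If the rows= figure on the top node drops well below half of
wg3ppbm_transaction after this, the earlier estimate was most likely a
default rather than something derived from real data.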

regards, tom lane
