From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Dmitry Lazurkin <dilaz03(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Perfomance of IN-clause with many elements and possible solutions |
Date: | 2017-07-31 17:53:19 |
Message-ID: | CAMkU=1xjPzenssKktcX3pqTt9EpJd6nECAHLxRg8_fgGc9VT_Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Mon, Jul 24, 2017 at 3:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wr
>
>
> regression=# EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM ids WHERE id
> IN
> :values_clause;
> QUERY PLAN
> ------------------------------------------------------------
> -----------------------------------------------------------------------
> Aggregate (cost=245006.16..245006.17 rows=1 width=8) (actual
> time=3550.581..3550.581 rows=1 loops=1)
> Execution time: 3550.700 ms
>
>
>
> regression=# set enable_hashagg TO 0;
> regression=# set enable_sort TO 0;
> SET
> regression=# EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM ids WHERE id
> IN
> :values_clause;
> QUERY PLAN
> ------------------------------------------------------------
> -------------------------------------------------------------------
> Aggregate (cost=320003.90..320003.91 rows=1 width=8) (actual
> time=3548.364..3548.364 rows=1 loops=1)
> Execution time: 3548.463 ms
>
>
> At least in this example, the actual runtimes are basically identical
> regardless, so there is no great point in sweating over it.
>
Since The run times are equal, but one is estimated to be 30% more
expensive, I think there is at least some little reason to sweat over it.
Incidentally, I accidentally ran this against a server running with your
patch from
https://www.postgresql.org/message-id/10078.1471955305@sss.pgh.pa.us. On
that server, it did choose the semi-join. But I have no idea why, as it
seems like the effect of that patch would have been to change the distinct
estimate from the magic hard-coded 200, to the natural 200 coming from the
query itself. Why would that affect the cost?
Cheers,
Jeff
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2017-07-31 18:03:32 | Re: Perfomance of IN-clause with many elements and possible solutions |
Previous Message | Ed Behn | 2017-07-31 17:52:29 | Partitioned TEMP tables |