Re: WHERE col = ANY($1) extended to 2 or more columns?

From: "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: WHERE col = ANY($1) extended to 2 or more columns?
Date: 2023-02-11 15:21:49
Message-ID: 20230211152149.ci75aun2m25sb645@hjp.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 2023-02-09 18:35:42 +0100, Dominique Devienne wrote:
> On Thu, Feb 9, 2023 at 5:37 PM David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
> wrote:
>
> On Thu, Feb 9, 2023 at 9:28 AM Alban Hertroys <haramrae(at)gmail(dot)com> wrote:
>
> > On 9 Feb 2023, at 16:41, Dominique Devienne <ddevienne(at)gmail(dot)com>
> wrote:
> > Now we'd like to do the same for composite keys, and I don't know how
> to do that.
>
> This works:
> => select (1, 'one'::text) in ((1, 'two'::text), (2, 'one'::text), (1,
> 'one'::text), (2, 'two'::text));
>
> But you cannot write the right-side of the IN as a single parameter which
> seems to be the primary constraint trying to be conformed to.
>
>
> Right. The goal is to (re)use a prepared statement (i.e. plan once), and bind
> the RHS (binary) array
> and do a single exec (single round-trip) to get the matching rows. AFAIK, this
> is the fastest way.

Planning time is often much less than execution time, so minimizing it
may not give you the fastest results.

For example I tried to fetch two data points from a medium sized (3 mio
rows) table ((macrobondtimeseries, date) is the primary key here):

wds=> explain (analyze) select macrobondtimeseries, date, value
from facttable_oecd_naq_2018
where (macrobondtimeseries, date) = any(
array [
( 'naq_mex_b1_gi_cqr_a' , '2013-01-01' ),
( 'naq_lux_p3s13_gpsa_a' , '1961-01-01' )
]::tsd[]
);
╔════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║ QUERY PLAN ║
╟────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
║ Gather (cost=1000.00..334945.37 rows=35242 width=34) (actual time=6.194..1618.968 rows=2 loops=1) ║
║ Workers Planned: 2 ║
║ Workers Launched: 2 ║
║ -> Parallel Seq Scan on facttable_oecd_naq_2018 (cost=0.00..330421.17 rows=14684 width=34) (actual time=1054.739..1589.818 rows=1 loops=3) ║
║ Filter: (ROW(macrobondtimeseries, date) = ANY ('{"(naq_mex_b1_gi_cqr_a,2013-01-01)","(naq_lux_p3s13_gpsa_a,1961-01-01)"}'::tsd[])) ║
║ Rows Removed by Filter: 1178191 ║
║ Planning Time: 1.833 ms ║
║ JIT: ║
║ Functions: 12 ║
║ Options: Inlining false, Optimization false, Expressions true, Deforming true ║
║ Timing: Generation 1.026 ms, Inlining 0.000 ms, Optimization 0.948 ms, Emission 12.613 ms, Total 14.586 ms ║
║ Execution Time: 1619.729 ms ║
╚════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
(12 rows)

wds=> explain (analyze) select macrobondtimeseries, date, value
from facttable_oecd_naq_2018
where (macrobondtimeseries, date) in
(
( 'naq_mex_b1_gi_cqr_a' , '2013-01-01' ),
( 'naq_lux_p3s13_gpsa_a' , '1961-01-01' )
);
╔═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║ QUERY PLAN ║
╟───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
║ Bitmap Heap Scan on facttable_oecd_naq_2018 (cost=4.13..7.17 rows=2 width=34) (actual time=0.259..0.263 rows=2 loops=1) ║
║ Recheck Cond: ((((macrobondtimeseries)::text = 'naq_mex_b1_gi_cqr_a'::text) AND (date = '2013-01-01'::date)) OR (((macrobondtimeseries)::text = 'naq_lux_p3s13_gpsa_a'::text) AND (date = '1961-01-01'::date))) ║
║ Heap Blocks: exact=1 ║
║ -> BitmapOr (cost=4.13..4.13 rows=2 width=0) (actual time=0.184..0.185 rows=0 loops=1) ║
║ -> Bitmap Index Scan on facttable_oecd_naq_2018_pkey (cost=0.00..2.06 rows=1 width=0) (actual time=0.124..0.125 rows=1 loops=1) ║
║ Index Cond: (((macrobondtimeseries)::text = 'naq_mex_b1_gi_cqr_a'::text) AND (date = '2013-01-01'::date)) ║
║ -> Bitmap Index Scan on facttable_oecd_naq_2018_pkey (cost=0.00..2.06 rows=1 width=0) (actual time=0.058..0.058 rows=1 loops=1) ║
║ Index Cond: (((macrobondtimeseries)::text = 'naq_lux_p3s13_gpsa_a'::text) AND (date = '1961-01-01'::date)) ║
║ Planning Time: 1.414 ms ║
║ Execution Time: 0.330 ms ║
╚═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
(10 rows)

The latter is almost 1000 times faster. Saving 1.8 ms on planning time
doesn't help you if you then waste 1.6 s on execution.

So sometimes it pays off to give the planner a little bit of extra
information to work on.

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp(at)hjp(dot)at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Peter J. Holzer 2023-02-11 22:00:31 Re: WHERE col = ANY($1) extended to 2 or more columns?
Previous Message jian he 2023-02-11 14:16:40 Re: How to pass table column values to function