Re: pg_stat_statements and "IN" conditions

From: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Sami Imseih <samimseih(at)gmail(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, yasuo(dot)honda(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, smithpb2250(at)gmail(dot)com, vignesh21(at)gmail(dot)com, michael(at)paquier(dot)xyz, nathandbossart(at)gmail(dot)com, stark(dot)cfm(at)gmail(dot)com, geidav(dot)pg(at)gmail(dot)com, marcos(at)f10(dot)com(dot)br, robertmhaas(at)gmail(dot)com, david(at)pgmasters(dot)net, pgsql-hackers(at)postgresql(dot)org, pavel(dot)trukhanov(at)gmail(dot)com, Sutou Kouhei <kou(at)clear-code(dot)com>
Subject: Re: pg_stat_statements and "IN" conditions
Date: 2025-03-17 19:05:42
Message-ID: 202503171905.usybbg5hfoxp@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2025-Feb-14, Dmitry Dolgov wrote:

> This should do it. The last patch for today, otherwise I'll probably add
> more bugs than features :)

Thank you. I've spent some time with this patch in the last few days,
and I propose a few changes. I renamed everything from "merge" to
"squash"; apart from that, it's mostly docs and code comments changes,
but I also removed the addition of a boolean argument to JUMBLE_LOCATION
which AFAICS is unnecessary, and did away with the business of checking
for function immutability. I also changed the code layout of
generate_normalized_query(); no functional changes, I just reordered the
code blocks (which caused a couple of lines to appear repeated that
weren't before).

You can see my patch on top of yours here:
https://github.com/alvherre/postgres/commits/query_id_squash_values/
and the CI run here:
https://cirrus-ci.com/build/5660053472018432

In addition, here I attach the complete patch on top of current master.
Unless there's some opposition to this, I intend to push this tomorrow.

I have to admit that looking at this part of the test,

+SELECT * FROM test_squash_cast WHERE data IN
+ (1::int4::casttesttype, 2::int4::casttesttype, 3::int4::casttesttype,
+ 4::int4::casttesttype, 5::int4::casttesttype, 6::int4::casttesttype,
+ 7::int4::casttesttype, 8::int4::casttesttype, 9::int4::casttesttype,
+ 10::int4::casttesttype, 11::int4::casttesttype);
+ id | data
+----+------
+(0 rows)
+
+SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
+ query | calls
+----------------------------------------------------+-------
+ SELECT * FROM test_squash_cast WHERE data IN +| 1
+ ($1 /*, ... */::int4::casttesttype) |
+ SELECT pg_stat_statements_reset() IS NOT NULL AS t | 1
+(2 rows)

and

+SELECT * FROM test_squash WHERE id IN (1::oid, 2::oid, 3::oid, 4::oid, 5::oid, 6::oid, 7::oid, 8::oid, 9::oid);
+ id | data
+----+------
+(0 rows)
+
+SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
+ query | calls
+------------------------------------------------------------+-------
+ SELECT * FROM test_squash WHERE id IN ($1 /*, ... */::oid) | 1

I am tempted to say that explicit casts should also be considered
squashable (that is, in IsSquashableConst() also allow the case of
func->funcformat == COERCE_EXPLICIT_CAST). That would also squash
queries such as this one:

+SELECT * FROM test_squash_bigint WHERE data IN
+ (1::bigint, 2::bigint, 3::bigint, 4::bigint, 5::bigint, 6::bigint,
+ 7::bigint, 8::bigint, 9::bigint, 10::bigint, 11::bigint);
+ id | data
+----+------
+(0 rows)
+
+SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
+ query | calls
+----------------------------------------------------------------------------------+-------
+ SELECT * FROM test_squash_bigint WHERE data IN +| 1
+ ($1::bigint, $2::bigint, $3::bigint, $4::bigint, $5::bigint, $6::bigint,+|
+ $7::bigint, $8::bigint, $9::bigint, $10::bigint, $11::bigint) |

I, frankly, see little argument for making a distinction here. We can
still discuss whether we prefer it one way or the other; we don't need
that decision to prevent me from pushing the patch I here attach, I think.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"The saddest aspect of life right now is that science gathers knowledge faster
than society gathers wisdom." (Isaac Asimov)

Attachment Content-Type Size
v29-0001-Prevent-jumbling-of-every-element-in-ArrayExpr.patch text/x-diff 45.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Álvaro Herrera 2025-03-17 19:14:16 Re: pg_stat_statements and "IN" conditions
Previous Message Andres Freund 2025-03-17 18:54:59 Re: BitmapHeapScan streaming read user and prelim refactoring