From: | Guillaume Lelarge <guillaume(at)lelarge(dot)info> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Add parallel columns for pg_stat_statements |
Date: | 2024-10-06 13:32:02 |
Message-ID: | CAECtzeWcwRQ6xoZDC6VgOR1q6K=18ahhjU7vgWOzyy2tFwHHEw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Michael,
Le jeu. 3 oct. 2024 à 09:15, Michael Paquier <michael(at)paquier(dot)xyz> a écrit :
> On Thu, Aug 29, 2024 at 10:08:23PM +0200, Guillaume Lelarge wrote:
> > This patch was a bit discussed on [1], and with more details on [2]. It's
> > based on another patch sent in 2022 (see [3]). It introduces seven new
> > columns in pg_stat_statements:
> >
> > * parallelized_queries_planned, number of times the query has been
> planned
> > to be parallelized,
> > * parallelized_queries_launched, number of times the query has been
> > executed with parallelization,
>
> Comparing the numbers of workers planned and launched with the number
> of times a query has been called and planned should provide a rather
> good equivalent, no? I am not sure that these two are mandatory to
> have.
>
>
I'm not sure I follow. That would mean that every time a query is executed,
it always gets the same amount of workers. Which is not guaranteed to be
true.
I would agree, though, that parallelized_queries_launched is probably not
that interesting. I could get rid of it if you think it should go away.
> * parallelized_workers_planned, number of parallel workers planned for
> > this query,
> > * parallelized_workers_launched, number of parallel workers executed for
> > this query,
>
> Yep. Definitely OK with these two. There is an overlap with what
> Benoit has sent here when it comes to publish this data to the
> executor state:
>
> https://www.postgresql.org/message-id/783bc7f7-659a-42fa-99dd-ee0565644e25@dalibo.com
>
>
Well, I don't see this as an overlap. Rather more information.
> > * parallelized_nodes, number of parallelized nodes,
> > * parallelized_nodes_all_workers, number of parallelized nodes which had
> > all requested workers,
> >
> > * parallelized_nodes_no_worker, number of parallelized nodes which had
> no
> > requested workers.
>
> I can see why you want to register this extra data on a node-basis,
> but how does that help when it comes to tune the parallel GUCs? We
> cannot control them at node level and the launched/planned ratio
> offers an equivalent of that. Not exactly, but that's enough to get a
> picture if there is a draught.
>
>
On this, I would agree with you. They are not that particularly useful to
get better setting for parallel GUCs. I can drop them if you want.
> A test script (test2.sql) is attached. You can execute it with "psql -Xef
> > test2.sql your_database" (your_database should not contain a t1 table as
> it
> > will be dropped and recreated).
>
> Let's add proper regression tests instead, including
> oldextversions.sql as this bumps the version of the module. See for
> example the tests of 6fd5071909a2 that can force workers to spawn
> for BRIN and btree queries, validating some of the stats published
> here.
>
Did this on the v2 version of the patch (attached here).
Thanks for your review. If you want the parallelized_queries_launched
column and the parallelized_nodes_* columns dropped, I can do that on a v3
patch.
Regards.
--
Guillaume.
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Add-parallel-columns-to-pg_stat_statements.patch | text/x-patch | 30.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Dmitry Dolgov | 2024-10-06 15:36:49 | System views for versions reporting |
Previous Message | Andrew Dunstan | 2024-10-06 13:12:59 | Re: Should CSV parsing be stricter about mid-field quotes? |