Re: Add parallel columns for pg_stat_statements

From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add parallel columns for pg_stat_statements
Date: 2024-10-07 08:00:13
Message-ID: CAECtzeUiv-ZY1yJ2Mva_chgaFqkT9F43cutE3A8xratXErc2-Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Le lun. 7 oct. 2024 à 02:18, Michael Paquier <michael(at)paquier(dot)xyz> a écrit :

> On Sun, Oct 06, 2024 at 03:32:02PM +0200, Guillaume Lelarge wrote:
> > I'm not sure I follow. That would mean that every time a query is
> executed,
> > it always gets the same amount of workers. Which is not guaranteed to be
> > true.
> >
> > I would agree, though, that parallelized_queries_launched is probably not
> > that interesting. I could get rid of it if you think it should go away.
>
> My point is that these stats are useful to know which action may have
> to be taken when reaching a mean, and numbers in pg_stat_statements
> offer hints that something is going wrong and that a closer lookup at
> an EXPLAIN plan may be required, particularly if the total number of
> workers planned and launched aggregated in the counters is unbalanced
> across queries. If the planned/launched ratio is balanced across most
> queries queries, a GUC adjustment may be OK. If the ratio is very
> unbalanced in a lower set of queries, I'd also look at tweaking GUCs
> instead like the per_gather. These counters give information that one
> or the other may be required.
>
> > Well, I don't see this as an overlap. Rather more information.
>
> Later versions of Benoit's patch have been accumulating this data in
> the executor state. v4 posted at [1] has the following diffs:
> --- a/src/include/nodes/execnodes.h
> +++ b/src/include/nodes/execnodes.h
> @@ -724,6 +724,9 @@ typedef struct EState
> */
> List *es_insert_pending_result_relations;
> List *es_insert_pending_modifytables;
> +
> + int es_workers_launched;
> + int es_workers_planned;
> } EState;
>
> Your v2 posted on this thread has that:
> @@ -707,6 +707,12 @@ typedef struct EState
> struct EPQState *es_epq_active;
>
> bool es_use_parallel_mode; /* can we use parallel
> workers? */
> + bool es_used_parallel_mode; /* was executed in
> parallel */
> + int es_parallelized_workers_launched;
> + int es_parallelized_workers_planned;
> + int es_parallelized_nodes; /* # of
> parallelized nodes */
> + int es_parallelized_nodes_all_workers; /* # of
> nodes with all workers launched */
> + int es_parallelized_nodes_no_worker; /* # of
> nodes with no workers launched */
>
> es_parallelized_workers_launched and es_workers_launched are the same
> thing in both.
>
>
My bad. I agree this is the way to go. See patch v3-0001 attached.

> > On this, I would agree with you. They are not that particularly useful to
> > get better setting for parallel GUCs. I can drop them if you want.
>
> Yep. I would remove them for now. This leads to more bloat.
>
>
Done. See patch v3-0002 attached.

> > Did this on the v2 version of the patch (attached here).
> >
> > Thanks for your review. If you want the parallelized_queries_launched
> > column and the parallelized_nodes_* columns dropped, I can do that on a
> v3
> > patch.
>
> I'd recommend to split that into more independent patches:
> - Introduce the two counters in EState with the incrementations done
> in nodeGatherMerge.c and nodeGather.c (mentioned that at [2], you may
> want to coordinate with Benoit to avoid duplicating the work).
> - Expand pg_stat_statements to use them for DMLs, SELECTs, well where
> they matter.
> - Look at expanding that for utilities that can do parallel jobs:
> CREATE INDEX and VACUUM, but this has lower priority to me, and this
> can reuse the same counters as the ones added by patch 2.
>
>
The first two are done. The last one is beyond my scope.

I'm now working on Benoit's patch to make it work with my v3-0001 patch.
I'll send the resulting patch on his thread.

> [1]:
> https://www.postgresql.org/message-id/6ecad3ad-835c-486c-9ebd-da87a9a97634@dalibo.com
> [2]: https://www.postgresql.org/message-id/Zv46wTMjLTuu2t9J@paquier.xyz
> --
> Michael
>

Regards.

--
Guillaume.

Attachment Content-Type Size
v3-0001-Introduce-two-new-counters-in-EState.patch text/x-patch 2.7 KB
v3-0002-Add-parallel-columns-to-pg_stat_statements.patch text/x-patch 25.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Guillaume Lelarge 2024-10-07 08:19:04 Re: Parallel workers stats in pg_stat_database
Previous Message Michael Paquier 2024-10-07 07:47:12 Commit fest 2024-09