Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes
Date: 2024-09-04 08:47:11
Message-ID: Ztgejz8q7NBdcReD@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Thu, Aug 29, 2024 at 04:04:05PM +0200, Guillaume Lelarge wrote:
> Hello,
>
> This patch was a bit discussed on [1], and with more details on [2]. It
> introduces four new columns in pg_stat_all_tables:
>
> * parallel_seq_scan
> * last_parallel_seq_scan
> * parallel_idx_scan
> * last_parallel_idx_scan
>
> and two new columns in pg_stat_all_indexes:
>
> * parallel_idx_scan
> * last_parallel_idx_scan
>
> As Benoit said yesterday, the intent is to help administrators evaluate the
> usage of parallel workers in their databases and help configuring
> parallelization usage.

Thanks for the patch. I think that's a good idea to provide more instrumentation
in this area. So, +1 regarding this patch.

A few random comments:

1 ===

+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>parallel_seq_scan</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of parallel sequential scans initiated on this table
+ </para></entry>
+ </row>

I wonder if we should not update the seq_scan too to indicate that it
includes the parallel_seq_scan.

Same kind of comment for last_seq_scan, idx_scan and last_idx_scan.

2 ===

@@ -410,6 +410,8 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock)
*/
if (scan->rs_base.rs_flags & SO_TYPE_SEQSCAN)
pgstat_count_heap_scan(scan->rs_base.rs_rd);
+ if (scan->rs_base.rs_parallel != NULL)
+ pgstat_count_parallel_heap_scan(scan->rs_base.rs_rd);

Indentation seems broken.

Shouldn't the parallel counter relies on the "scan->rs_base.rs_flags & SO_TYPE_SEQSCAN"
test too?

What about to get rid of the pgstat_count_parallel_heap_scan and add an extra
bolean parameter to pgstat_count_heap_scan to indicate if counts.parallelnumscans
should be incremented too?

Something like:

pgstat_count_heap_scan(scan->rs_base.rs_rd, scan->rs_base.rs_parallel != NULL)

3 ===

Same comment for pgstat_count_index_scan (add an extra bolean parameter) and
get rid of pgstat_count_parallel_index_scan()).

I think that 2 === and 3 === would help to avoid missing increments should we
add those call to other places in the future.

4 ===

+ if (lstats->counts.numscans || lstats->counts.parallelnumscans)

Is it possible to have (lstats->counts.parallelnumscans) whithout having
(lstats->counts.numscans) ?

> First time I had to add new columns to a statistics catalog. I'm actually
> not sure that we were right to change pg_proc.dat manually.

I think that's the right way to do.

I don't see a CF entry for this patch. Would you mind creating one so that
we don't lost track of it?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-09-04 08:48:10 Re: list of acknowledgments for PG17
Previous Message Peter Eisentraut 2024-09-04 08:40:06 Re: Virtual generated columns