Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add parallel columns for seq scan and index scan on pg_stat_all_tables and _indexes
Date: 2024-11-11 16:06:43
Message-ID: CA+TgmoZ5oiVFF1=AqmESrJo4YPbk9WO8n+NnLPCm6wNrnoEooQ@mail.gmail.com

On Sun, Nov 10, 2024 at 9:05 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> As hinted on other related threads like around [1], I am so-so about
> the proposal of these numbers at table and index level now that we
> have e7a9496de906 and 5d4298e75f25.

I think the question to which we don't have a clear answer is: for
what purpose would you want to be able to distinguish parallel and
non-parallel scans on a per-table basis?

I think it's fairly clear why the existing counters exist at a table
level. If an index isn't getting very many index scans, perhaps it's
useless -- or worse than useless if it interferes with HOT -- and
should be dropped. On the other hand if a table is getting a lot of
sequential scans even though it happens to be quite large, perhaps new
indexes are needed. Or if the indexes that we expect to get used are
not the same as those actually getting used, perhaps we want to add or
drop indexes or adjust the queries.
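
As a rough illustration of the tuning decisions described above, here is a minimal sketch that flags rarely scanned indexes and large tables with many sequential scans. The field names mirror the relname, seq_scan, indexrelname and idx_scan columns of the statistics views; the size field, the thresholds, and the helper names are hypothetical, and the rows are assumed to have been fetched already.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TableStats:
    relname: str
    seq_scan: int      # number of sequential scans started on the table
    size_bytes: int    # assumed to be fetched separately; not a stats-view column


@dataclass
class IndexStats:
    indexrelname: str
    idx_scan: int      # number of index scans started on the index


def rarely_used_indexes(indexes: List[IndexStats], max_scans: int = 0) -> List[str]:
    """Return names of indexes scanned at most max_scans times (drop candidates)."""
    return [i.indexrelname for i in indexes if i.idx_scan <= max_scans]


def seq_scan_heavy_tables(
    tables: List[TableStats],
    min_size_bytes: int = 1 << 30,   # hypothetical cutoff for "quite large": 1 GiB
    min_seq_scans: int = 1000,       # hypothetical cutoff for "a lot of" scans
) -> List[str]:
    """Return names of large tables with many sequential scans (index candidates)."""
    return [
        t.relname
        for t in tables
        if t.size_bytes >= min_size_bytes and t.seq_scan >= min_seq_scans
    ]
```

The thresholds are placeholders; in practice they depend on the workload and on how long the counters have been accumulating.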

But it is unclear to me what sort of tuning we would do based on
knowing how many of the scans on a certain table or a certain index
were parallel vs non-parallel. I have not fully reviewed the threads
linked in the original post; but I did look at them briefly and did
not immediately see discussion of the specific counters proposed here.
I also don't see anything in this thread that clearly explains why we
should want this exact thing. I don't want to make it sound like I
know that this is useless; I'm sure that Guillaume probably has lots
of hands-on tuning experience with this stuff that I lack. But the
reasons aren't clearly spelled out as far as I can see, and I'm having
some trouble imagining what they are.

Compare the parallel worker drought stuff. It's really clear how that
is intended to be used. If we're routinely failing to launch workers,
then either max_parallel_workers_per_gather is too high or
max_parallel_workers is too low. Now, I will admit that I have a few
doubts about whether that feature will get much real-world use, but it
seems hard to doubt that it HAS a use. In this case, that seems less
clear.
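
To make the worker-drought check concrete, here is a minimal sketch that computes the fraction of planned parallel workers that were never launched. The parameter names, the function names, and the 20% threshold are hypothetical; the two inputs are assumed to be cumulative counts of workers planned and workers actually launched.

```python
def worker_shortfall(workers_planned: int, workers_launched: int) -> float:
    """Fraction of planned parallel workers that failed to launch (0.0 to 1.0)."""
    if workers_planned <= 0:
        # Nothing was planned, so there is no shortfall to report.
        return 0.0
    return (workers_planned - workers_launched) / workers_planned


def looks_like_drought(
    workers_planned: int,
    workers_launched: int,
    threshold: float = 0.2,  # hypothetical cutoff for "routinely failing"
) -> bool:
    """True when the shortfall suggests revisiting max_parallel_workers_per_gather
    (possibly too high) or max_parallel_workers (possibly too low)."""
    return worker_shortfall(workers_planned, workers_launched) >= threshold
```

For example, 100 workers planned with 75 launched gives a shortfall of 0.25, which this sketch would flag.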

--
Robert Haas
EDB: http://www.enterprisedb.com
