Re: Why is Postgres only using 8 cores for partitioned count? [Parallel Append]

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
Cc: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Why is Postgres only using 8 cores for partitioned count? [Parallel Append]
Date: 2021-02-16 03:15:41
Message-ID: CAApHDvqaAjX6rJO_Gh7YW_ubW6LphCkz03fLtQPnzMEEppF1Zg@mail.gmail.com
Lists: pgsql-general

On Mon, 15 Feb 2021 at 10:16, Gavin Flower
<GavinFlower(at)archidevsys(dot)co(dot)nz> wrote:
> Just wondering why there is a hard coded limit.

I don't see where the hardcoded limit is. The total number is limited
to max_parallel_workers_per_gather, but there's nothing hardcoded
about the value of that.
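For anyone following along, those limits are ordinary GUCs and can be inspected or changed per session. A minimal example (the value in the SET is just an illustration, not a recommendation):

```sql
-- How many workers a single Gather / Gather Merge node may use:
SHOW max_parallel_workers_per_gather;

-- Cluster-wide pool of parallel workers shared by all backends:
SHOW max_parallel_workers;

-- Raise the per-gather limit for the current session only:
SET max_parallel_workers_per_gather = 16;
```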

> While I agree it might be good to be able specify the number of workers,
> sure it would be possible to derive a suitable default based on the
> number of effective processors available?

It's a pretty tricky thing to get right. The problem is that
something has to rationalise the use of parallel worker processes.
Does it seem reasonable to you to use the sum of the Append child
parallel workers? If so, I imagine someone else would think that
would be pretty insane. We do have to consider the fact that we're
trying to share those parallel worker processes with other backends
which also might want to get some use out of them.

As for whether we need some reloption for partitioned tables, I think
that's also tricky. Sure, we could go and add a "parallel_workers"
relopt to partitioned tables, but then that's going to be applied
regardless of how many partitions survive partition pruning. There
could be as few as 2 subpaths in an Append, or the number could be
in the thousands. I can't imagine anyone really wants the same number
of parallel workers in each of those two cases. So I can understand
why ab7271677 wanted to take into account the number of append
children.
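For comparison, plain tables already accept a "parallel_workers" storage parameter that overrides the planner's own estimate; it's that option which doesn't exist for partitioned tables. A sketch (the table name is made up, and the exact error text on the partitioned table depends on the server version):

```sql
-- Works on a regular table: tell the planner to consider 8 workers
-- for parallel scans of this table.
ALTER TABLE measurements_2021 SET (parallel_workers = 8);

-- The same statement on a partitioned table currently fails,
-- which is the gap being discussed above.
```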

Maybe there's some room for some other relopt that just changes the
behaviour of that code. It does not seem too unreasonable that
someone might like to take the sum of the Append child parallel
workers. That value would still be capped at
max_parallel_workers_per_gather, so it shouldn't ever go too insane
unless someone set that GUC to something insane, which would be their
choice. I'm not too sure what such a relopt would be called.
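To make the two behaviours concrete, here's a simplified sketch in Python of how a Parallel Append might size its worker request. This is not the actual planner code from allpaths.c, just an illustration of the trade-off: the current-style heuristic takes the largest child request, bumped logarithmically by the number of children, while the hypothetical relopt would sum the children instead. Both are capped at max_parallel_workers_per_gather:

```python
import math

def append_parallel_workers(child_workers, max_per_gather, use_sum=False):
    """Illustrative only -- not the real PostgreSQL planner logic.

    child_workers: list of the worker counts each Append child wants.
    max_per_gather: the max_parallel_workers_per_gather cap.
    use_sum: the hypothetical relopt behaviour discussed above.
    """
    if use_sum:
        # Hypothetical: sum the children's requests...
        want = sum(child_workers)
    else:
        # Current-style heuristic: take the largest child request,
        # bumped only logarithmically as the child count grows.
        want = max(child_workers)
        want = max(want, math.ceil(math.log2(len(child_workers) + 1)))
    # ...but either way, never exceed the per-gather cap.
    return min(want, max_per_gather)
```

With 4 children wanting 2 workers each and a cap of 8, the heuristic settles on 3 workers, while the sum variant hits the cap at 8; with 32 such children the sum variant would want 64 but is still held to 8, which is why the cap keeps the sum behaviour from going off the rails.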

Additionally, for the case being reported here: since all Append
children are foreign tables, there is actually some work going on to
make it so workers don't have to sit by and wait until the foreign
server returns the results. I don't think anyone would disagree that
it's pretty poor use of a parallel worker to have it sit there doing
nothing for minutes at a time waiting for a single tuple from a
foreign data wrapper. I'm not sure of the status of that work, but if
you want to learn more about it, please see [1].

David

[1] https://commitfest.postgresql.org/32/2491/
