Re: a pool for parallel worker

From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: Andy Fan <zhihuifan1213(at)163(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: a pool for parallel worker
Date: 2025-03-26 05:42:20
Message-ID: CALdSSPgJMaLeegakjVyWrk7nkqmsbT7hwyZao4FJj9w267GXDA@mail.gmail.com
Lists: pgsql-hackers

On Tue, 11 Mar 2025 at 17:38, Andy Fan <zhihuifan1213(at)163(dot)com> wrote:
> Hi,
>

Hi!

> Currently, when a query needs parallel workers, the postmaster spawns
> backends for the query, and when the work is done those backends
> exit. This is wasteful: the syscache, relcache, smgr cache, and vfd
> cache are discarded, and the fork/exit syscalls themselves have a cost.
>
> I am wondering whether we should preallocate (or lazily create) some
> backends as a pool for parallel workers. The benefits include:
>
> (1) Lower the actual startup cost of a parallel worker.
> (2) Make the core better suited to cases where the executor needs a
> new worker to run a piece of the plan. I think this is needed by some
> data-redistribution executors in a distributed database.
>
> I guess both cases can share some well-designed code, such as costing or
> transferring data between worker and leader.

Surely forking from the postmaster is costly.
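
To make the trade-off concrete, here is a toy sketch of the lazily
created pool described above. It is not PostgreSQL code: Worker,
WorkerPool, and the cache dictionary are hypothetical stand-ins, and the
"spawned" counter only stands in for the number of forks.

```python
from collections import deque


class Worker:
    """Stand-in for a parallel worker backend (hypothetical, not PostgreSQL code)."""

    def __init__(self, worker_id):
        self.worker_id = worker_id
        self.cache = {}  # stands in for syscache/relcache/smgr/vfd state

    def run(self, relation):
        # The first touch of a relation is a cache miss; later ones are hits.
        hit = relation in self.cache
        self.cache[relation] = True
        return hit


class WorkerPool:
    """Creates workers lazily and keeps them for reuse instead of letting them exit."""

    def __init__(self, max_workers):
        self.max_workers = max_workers
        self.idle = deque()
        self.spawned = 0  # counts the expensive "fork" events

    def acquire(self):
        if self.idle:
            return self.idle.popleft()  # reuse: no fork, warm cache
        if self.spawned >= self.max_workers:
            raise RuntimeError("no worker available")
        self.spawned += 1
        return Worker(self.spawned)

    def release(self, worker):
        self.idle.append(worker)


pool = WorkerPool(max_workers=2)
hits = []
for _ in range(3):
    w = pool.acquire()
    hits.append(w.run("pg_class"))
    pool.release(w)

print(pool.spawned, hits)
```

In this model, three consecutive tasks cost one spawn, and only the
first task misses the cache. Without the pool, each task would pay for a
new worker and start with cold caches.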

> The annoying thing about the pool is that it is [dbid + userId] based, by
> which I mean that if the dbid or userId differs from that of a connection
> in the pool, the connection can't be reused. To reduce the effect of
> userId, I wonder if we can start the pool as a superuser and then switch
> the user information with 'SET ROLE xxx'. The pool can also be created lazily.

I don't think this is secure. Currently, if your PostgreSQL process
was started under a superuser role, there is no way to undo that.
Consider a worker in the pool running a user query that calls a UDF.
Inside this UDF, one can simply RESET SESSION AUTHORIZATION and proceed
to do anything with superuser rights.

> Any comments on this idea?
>
> --
> Best Regards
> Andy Fan

--
Best regards,
Kirill Reshke
