From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: a pool for parallel worker
Date: 2025-03-11 12:38:38
Message-ID: 87h63zg2sx.fsf@163.com
Lists: pgsql-hackers
Hi,
Currently, when a query needs parallel workers, the postmaster spawns
new backends for that query, and when the work is done those backends
exit. There is some waste here: the syscache, relcache, smgr cache and
vfd cache are rebuilt from scratch each time, plus the cost of the
fork/exit syscalls themselves.
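For reference, the current launch path registers a brand-new bgworker
for every worker of every query; condensed from LaunchParallelWorkers()
in src/backend/access/transam/parallel.c:

    /* Condensed: each call pays fork() plus cache warm-up, then exits. */
    BackgroundWorker worker;

    memset(&worker, 0, sizeof(worker));
    worker.bgw_flags = BGWORKER_SHMEM_ACCESS |
        BGWORKER_BACKEND_DATABASE_CONNECTION | BGWORKER_CLASS_PARALLEL;
    worker.bgw_start_time = BgWorkerStart_ConsistentState;
    worker.bgw_restart_time = BGW_NEVER_RESTART;
    strcpy(worker.bgw_library_name, "postgres");
    strcpy(worker.bgw_function_name, "ParallelWorkerMain");
    worker.bgw_notify_pid = MyProcPid;

    for (i = 0; i < pcxt->nworkers_to_launch; ++i)
    {
        /* a fresh backend is forked for every slot, every time */
        RegisterDynamicBackgroundWorker(&worker,
                                        &pcxt->worker[i].bgworker_handle);
    }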
I am wondering whether we should preallocate (or create lazily) some
backends as a pool for parallel workers. The benefits include:
(1) Lower the actual startup cost of a parallel worker.
(2) Make the core better suited for cases where the executor needs to
grab a new worker on demand to run a piece of a plan. I think this is
needed by data-redistribution executor nodes in a distributed database.
I guess both cases could share some well-designed code, such as costing
or transferring data between worker and leader.
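For the data-transfer part, the existing shm_mq machinery
(src/include/storage/shm_mq.h) already carries leader/worker tuple
traffic, and a pooled worker could presumably reuse it unchanged. A
simplified leader-side receive loop, where ProcessTupleFromWorker() is
a hypothetical consumer:

    /* mq lives in the DSM segment; handle is the worker's bgw handle */
    shm_mq_handle *mqh = shm_mq_attach(mq, seg, handle);

    for (;;)
    {
        Size        nbytes;
        void       *data;
        shm_mq_result res;

        res = shm_mq_receive(mqh, &nbytes, &data, /* nowait */ false);
        if (res == SHM_MQ_DETACHED)
            break;              /* worker finished and detached */
        if (res == SHM_MQ_SUCCESS)
            ProcessTupleFromWorker(data, nbytes);   /* hypothetical */
    }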
The annoying thing about the pool is that it would be [dbid + userId]
based; that is, if the dbid or userId differs from those of a backend
in the pool, that backend can't be reused. To reduce the userId
restriction, we could start the pooled backends as a superuser and then
switch the user with 'SET ROLE xxx'. The pool could also be created
lazily. A rough sketch of what checkout could look like is below.
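To make the idea concrete, here is a hypothetical sketch of a checkout
keyed by (dbid, roleid). None of these names (PoolWorker,
PoolCheckoutWorker, PoolWorkerSetRole) exist in core; they are only
illustration of the shape, not proposed API:

    #include "postgres.h"
    #include "postmaster/bgworker.h"

    typedef struct PoolWorker
    {
        Oid         dboid;      /* database this backend is attached to */
        Oid         roleoid;    /* role currently active via SET ROLE */
        BackgroundWorkerHandle *handle;
        bool        in_use;
    } PoolWorker;

    static PoolWorker pool[32];     /* hypothetical fixed-size pool */

    /* Hypothetical: ask the pooled backend to run SET ROLE. */
    extern void PoolWorkerSetRole(PoolWorker *w, Oid roleoid);

    /*
     * Reuse an idle backend already attached to the target database,
     * switching its role if needed; NULL means fall back to forking a
     * fresh worker the way we do today.
     */
    static PoolWorker *
    PoolCheckoutWorker(Oid dboid, Oid roleoid)
    {
        for (int i = 0; i < lengthof(pool); i++)
        {
            PoolWorker *w = &pool[i];

            if (!w->in_use && w->dboid == dboid)
            {
                if (w->roleoid != roleoid)
                {
                    PoolWorkerSetRole(w, roleoid);
                    w->roleoid = roleoid;
                }
                w->in_use = true;
                return w;
            }
        }
        return NULL;
    }

Since a backend can't switch databases after startup, dboid stays a
hard key; only the role part can be papered over with SET ROLE.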
Any comments on this idea?
--
Best Regards
Andy Fan