From: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
---|---|
To: | Hannu Krosing <hannuk(at)google(dot)com> |
Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Let's make PostgreSQL multi-threaded |
Date: | 2023-06-12 04:01:17 |
Message-ID: | CAFiTN-vJqo4TSBpkQTJqhYz6CL0M=cPhQZUXnop1uDC47s2hBg@mail.gmail.com |
Lists: | pgsql-hackers |
On Sat, Jun 10, 2023 at 11:32 PM Hannu Krosing <hannuk(at)google(dot)com> wrote:
>
> On Mon, Jun 5, 2023 at 4:52 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> >
> > If there are no major objections, I'm going to update the developer FAQ,
> > removing the excuses there for why we don't use threads [1].
>
> I think it is not wise to start the wholesale removal of the objections there.
>
> But I think it is worthwhile to revisit the section about threads and
> maybe split out the historic part which is no longer true, and provide
> both pros and cons for these.
>
> I started with this short summary from the discussion in this thread,
> feel free to expand, argue, fix :)
> * is current excuse
> -- is counterargument or ack
> ----------------
> As an example, threads are not yet used instead of multiple processes
> for backends because:
> * Historically, threads were poorly supported and buggy.
> -- yes they were, not relevant now when threads are well-supported and non-buggy
>
> * An error in one backend can corrupt other backends if they're
> threads within a single process
> -- still valid for silent corruption
> -- for detected crash - yes, but we are restarting all backends in
> case of crash anyway.
>
> * Speed improvements using threads are small compared to the remaining
> backend startup time.
> -- we now have some measurements that show significant performance
> improvements not related to startup time
>
> * The backend code would be more complex.
> -- this is still the case
> -- even more worrisome is that all extensions also need to be rewritten
> -- and many incompatibilities will be silent and take potentially years to find
>
> * Terminating backend processes allows the OS to cleanly and quickly
> free all resources, protecting against memory and file descriptor
> leaks and making backend shutdown cheaper and faster
> -- still true
>
> * Debugging threaded programs is much harder than debugging worker
> processes, and core dumps are much less useful
> -- this was countered by claiming that
> -- by now we have reasonable debugger support for threads
> -- there is no direct debugger support for debugging the exact
> system set up like PostgreSQL processes + shared memory
>
> * Sharing of read-only executable mappings and the use of
> shared_buffers means processes, like threads, are very memory
> efficient
> -- this seems to say that the current process model is as good as threads ?
> -- there were a few counterarguments
> -- per-backend virtual memory mapping can add up to a significant
> amount of extra RAM usage
> -- the discussion did not yet touch various per-backend caches
> (pg_catalog cache, statement cache) which are arguably easier to
> implement in a threaded model
> -- TLB reload at each process switch is expensive and would be
> mostly avoided in case of threads
I think it is also worth mentioning that a threaded model would simplify
the parallel worker infrastructure, e.g. for 'parallel query' and
'parallel vacuum'.
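
To illustrate what I mean, here is a toy sketch in Python (not PostgreSQL
code; parallel_sum and worker are made-up names for this example). With
separate processes, the leader generally has to place the state to be
shared in a shared memory segment and the workers have to attach to it.
With threads, the workers can simply read and write the leader's data
structures in place:

```python
import threading


def parallel_sum(values, n_workers=4):
    """Sum `values` using n_workers threads that share the leader's memory."""
    # One result slot per worker, living in the leader's address space.
    partials = [0] * n_workers

    def worker(idx):
        # The worker reads `values` and writes `partials` directly.
        # Nothing is copied or serialized, because all threads share
        # one address space. Each worker touches only its own slot.
        partials[idx] = sum(values[idx::n_workers])

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # The leader combines the per-worker results.
    return sum(partials)


if __name__ == "__main__":
    print(parallel_sum(list(range(1000))))  # prints 499500
```

The sketch only shows the state-sharing aspect; it says nothing about
performance or about how a real executor would hand out work.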
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com