From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Craig Ringer <craig(at)2ndquadrant(dot)com> |
Cc: | AMatveev(at)bitec(dot)ru, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: One process per session lack of sharing |
Date: | 2016-07-19 17:42:42 |
Message-ID: | CA+TgmoZpmr1yNcd9mSnJMxKhD1Px_ySdbR31Ofjo9Va1DEH+xg@mail.gmail.com |
Lists: | pgsql-hackers |

On Mon, Jul 18, 2016 at 8:56 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> Since I got started with Pg, I've taken it as given that PostgreSQL Will
> Never Use Threads, Don't Even Talk About It. As taboo as query hints or more
> so. Is this actually a serious option?

I'm sure that depends on who you ask. But for myself, yes, I think
it's a serious option. I think that the sort of minimal conversion
that I discussed above could be done and made stable enough to label
as "we have this experimental option..." by one competent developer in
the course of one release cycle without otherwise unduly disrupting
development. From that base, we could consider patches to optimize
the thread model case, and maybe after gaining some experience and
letting it shake out for a release or two we'd decide that the thread
model is ready to be officially supported. I bet there would be a lot
of interest in the thread model from the user and developer
communities. It would probably be a significant win on Windows -
where I understand that the ratio of process creation cost : thread
creation cost is much worse than it is on Linux - and it would
probably open up numerous possible optimizations for parallel query.
It would probably also have some downsides and likely some horrible
bugs, but that's why you start it out as an experimental feature.

In short, I believe the conventional wisdom on this topic is
misguided. Most of the previous discussions of using threading have
assumed that we'd go through all of the backend-private stuff and make
it thread-safe. That's a bad plan, first because it's not very
well-defined, second because it could slow down parts of the system
that rely on the absence of synchronization primitives in certain code
paths, and third because it requires a single mammoth act of
development on a scale that would be extremely hard to make
successful. However, the method that I'm proposing is a completely
different kettle of fish. It is not, as Tom points out, entirely
without danger, but it allows the first patch to be a mostly
mechanical transformation and then allows incremental development on
top of that framework. For that reason, I believe it's at least an
order of magnitude less impractical than the "go through and make
everything thread-safe" approach. Unlike that approach, it also makes
it realistically possible to support both models.

None of that means that it will necessarily work out, but I'm bullish.
Even though parallel query has already had more bug reports than I
would have liked, it's good evidence that you can make architectural
changes that touch almost every part of the system without necessarily
breaking everything. You just have to go slow, be methodical, and
budget time to fix the bugs.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company