From: | Hannu Krosing <hannuk(at)google(dot)com> |
---|---|
To: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Let's make PostgreSQL multi-threaded |
Date: | 2023-06-08 12:44:11 |
Message-ID: | CAMT0RQSfoUCNskuweVBhmEiWh76q+eqDhX+5_bWpq8Qq-KuTTg@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Jun 8, 2023 at 2:15 PM Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
>
> On Thu, 8 Jun 2023 at 11:54, Hannu Krosing <hannuk(at)google(dot)com> wrote:
> >
> > On Wed, Jun 7, 2023 at 11:37 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > >
> > > Hi,
> > >
> > > On 2023-06-05 13:40:13 -0400, Jonathan S. Katz wrote:
> > > > 2. While I wouldn't want to necessarily discourage a moonshot effort, I
> > > > would ask if developer time could be better spent on tackling some of the
> > > > other problems around vertical scalability? Per some PGCon discussions,
> > > > there's still room for improvement in how PostgreSQL can best utilize
> > > > resources available on very large "commodity" machines (a 448-core / 24TB RAM
> > > > instance comes to mind).
> > >
> > > I think we're starting to hit quite a few limits related to the process model,
> > > particularly on bigger machines. The overhead of cross-process context
> > > switches is inherently higher than switching between threads in the same
> > > process - and my suspicion is that that overhead will continue to
> > > increase. Once you have a significant number of connections we end up spending
> > > a *lot* of time in TLB misses, and that's inherent to the process model,
> > > because you can't share the TLB across processes.
> >
> >
> > This part was touched on in the "AMA with a Linux Kernel Hacker"
> > Unconference session, where he mentioned that he had proposed a
> > 'mshare' syscall for this.
> >
> > So maybe a more fruitful way of fixing the perceived issues with the
> > process model is to push for small changes in Linux to overcome them,
> > avoiding a wholesale rewrite?
>
> We support not just Linux, but also Windows and several (?) BSDs. I'm
> not against pushing Linux to make things easier for us, but Linux is
> an open source project, too, where someone needs to put in time to get
> the shiny things that you want. And I'd rather see our time spent in
> PostgreSQL, as Linux is only used by a part of our user base.
Do we have any statistics for the distribution of our user base?

My gut feeling says that for performance-critical use the non-Linux
share is in the low single-digit percentages at best.

My fascination with open source started with the realisation that instead
of workarounds you can actually fix the problem at the source. So if the
specific problem is that the TLB is not shared, then the proper fix is
making it shared instead of rewriting everything else to get around
it. None of us is limited to writing code in PostgreSQL only. If the
easiest and most generic fix can be done in Linux, then so be it.

It is also possible that Windows and *BSD already have a similar feature.
>
> > > The amount of duplicated code we have to deal with due to the process model
> > > is quite substantial. We have local memory, statically allocated shared memory
> > > and dynamically allocated shared memory variants for some things. And that's
> > > just going to continue.
> >
> > Maybe we can already remove the distinction between static and dynamic
> > shared memory ?
>
> That sounds like a bad idea, dynamic shared memory is more expensive
> to maintain than our static shared memory systems, not least
> because DSM is not guaranteed to share the same addresses in each
> process' address space.
Then this too needs to be fixed.
>
> > Though I already heard some complaints at the conference discussions
> > that having the dynamic version available has made some developers
> > sloppy in using it resulting in wastefulness.
>
> Do you know any examples of this wastefulness?
No. Just somebody mentioned it in a hallway conversation and the rest
of the developers present mumbled approvingly :)
> > > > I'm purposely giving a nonanswer on whether it's a worthwhile goal, but
> > > > rather I'd be curious where it could stack up against some other efforts to
> > > > continue to help PostgreSQL improve performance and handle very large
> > > > workloads.
> > >
> > > There's plenty of things we can do before, but in the end I think tackling the
> > > issues you mention and moving to threads are quite tightly linked.
> >
> > Still, we should be focusing our attention on solving the issues and
> > not at "moving to threads" and hoping this will fix the issues by
> > itself.
>
> I suspect that it is much easier to solve some of the issues when
> working in a shared address space.
Probably. But it would come at the cost of needing to change a lot of
other parts of PostgreSQL.
I am not against making code cleaner for potential threaded model
support. I am just a bit sceptical about the actual switch being easy,
or doable in the next 10-15 years.
> E.g. resizing shared_buffers is difficult right now due to the use of
> a static allocation of shared memory, but if we had access to a single
> shared address space, it'd be easier to do any cleanup necessary for
> dynamically increasing/decreasing its size.
This again could be done with shared memory mapping + dynamic shared memory.
> Same with parallel workers - if we have a shared address space, the
> workers can pass any sized objects around without being required to
> move the tuples through DSM and waiting for the leader process to
> empty that buffer when it gets full.
Larger shared memory :)
Same for shared plan cache and shared schema cache.
> Sure, most of that is probably possible with DSM as well, it's just
> that I see a lot more issues that you need to take care of when you
> don't have a shared address space (such as the pointer translation we
> do in dsa_get_address).
All of the above seems to point to the need for a single thing: having
an option for shared memory mappings.
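
To make the address-space point concrete, here is a minimal sketch in plain
POSIX C (not PostgreSQL code) of why a segment mapped at different addresses
forces offset-based pointers. The names rel_ptr, rel_to_abs, abs_to_rel and
map_two_views are hypothetical and only loosely analogous to what
dsa_get_address does. Two mappings inside one process stand in for two
backends attaching the same segment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical relative pointer: a byte offset from the segment start. */
typedef size_t rel_ptr;

/* Translate an offset into an address that is valid in one mapping. */
void *rel_to_abs(void *base, rel_ptr p)
{
    return (char *) base + p;
}

/* Translate an address inside one mapping back into a portable offset. */
rel_ptr abs_to_rel(void *base, void *addr)
{
    return (size_t) ((char *) addr - (char *) base);
}

/*
 * Map the same temporary file twice with MAP_SHARED. The kernel chooses
 * two different base addresses, much as it may for two processes that
 * attach the same shared segment. Returns 0 on success, -1 on failure.
 */
int map_two_views(size_t size, char **view_a, char **view_b)
{
    FILE *f = tmpfile();
    int   fd;

    if (f == NULL)
        return -1;
    fd = fileno(f);
    if (ftruncate(fd, (off_t) size) != 0)
    {
        fclose(f);
        return -1;
    }

    *view_a = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    *view_b = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mappings remain valid after the descriptor is closed. */
    fclose(f);

    if (*view_a == MAP_FAILED || *view_b == MAP_FAILED)
        return -1;
    return 0;
}
```

After map_two_views(4096, &a, &b), a string written through
rel_to_abs(a, 128) can be read back through rel_to_abs(b, 128), whereas the
raw pointer a + 128 would mean nothing to a process that only has the
mapping at b. If every process could map the segment at the same address,
the translation step would not be needed.
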
So let's focus on fixing things with minimal required change.
And this would not have an adverse effect on systems that cannot
share mappings; they just won't become faster. And they are all welcome
to add the option for shared mappings too if they see enough value in
it.
This may sound like the same thing as the threaded model, but it should
need far fewer changes, and likely no changes for most out-of-tree
extensions.
---
Cheers
Hannu