Re: Let's make PostgreSQL multi-threaded

From: Stephan Doliov <stephan(dot)doliov(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Hannu Krosing <hannuk(at)google(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Let's make PostgreSQL multi-threaded
Date: 2023-06-08 23:35:59
Message-ID: CAFOdmV8_7BdvcdJC3EsiB7ayR0gQFOjtmUmOPTBJ_Y3qCdYN6w@mail.gmail.com
Lists: pgsql-hackers

This is an interesting thread. Regarding the OP's call to make PG
multi-threaded, I think there should be a clear, identifiable performance
target and named use cases for it. How much of a performance boost can be
expected, and in which data application context? Will queries return faster
for transactional use cases? For analytic use cases? How much data needs to
be stored before one can observe the difference or, better yet, a difference
with a measurable impact on cloud compute costs, expressed as a % of those
costs? I think if you can demonstrate what those savings amount to for
different test datasets, you can find the momentum to pursue it. Beyond
that, even with better modern tooling for multi-threaded development, it's
obviously a big lift (may well be worth it!). Some of us cagey old cats on
this list (at least me) still have some work to do to shed the baggage that
the previous pain of MT dev has caused us. :-)

Cheers,
Steve

On Thu, Jun 8, 2023 at 1:26 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2023-06-09 07:34:49 +1200, Thomas Munro wrote:
> > I wasn't in Matthew Wilcox's unconference in Ottawa but I found an
> > older article on LWN:
> >
> > https://lwn.net/Articles/895217/
> >
> > For what it's worth, FreeBSD hackers have studied this topic too (and
> > it's been done in Android and no doubt other systems before):
> >
> > https://www.cs.rochester.edu/u/sandhya/papers/ispass19.pdf
> >
> > I've shared that paper on this list before in the context of
> > super/huge pages and their benefits (to executable code, and to the
> > buffer pool), but a second topic in that paper is the idea of a shared
> > page table: "We find that sharing PTPs across different processes can
> > reduce execution cycles by as much as 6.9%. Moreover, the combined
> > effects of using superpages to map the main executable and sharing
> > PTPs for the small shared libraries can reduce execution cycles up to
> > 18.2%." And that's just part of it, because those guys are more
> > interested in shared code/libraries and such so that's probably not
> > even getting to the stuff like buffer pool and DSMs that we might tend
> > to think of first.
>
> I've experimented with using huge pages for executable code on Linux, and the
> benefits are quite noticeable:
>
> https://www.postgresql.org/message-id/20221104212126.qfh3yzi7luvyy5d6%40awork3.anarazel.de
>
> I'm a bit dubious that sharing the page table for executable code increases the
> benefit that much further in real workloads. I suspect the reason it was
> different for the authors of the paper is:
>
> > A fixed number of back-to-back
> > transactions are performed on a 5GB database, and we use the
> > -C option of pgbench to toggle between reconnecting after
> > each transaction (reconnect mode) and using one persistent
> > connection per client (persistent connection mode). We use
> > the reconnect mode by default unless stated otherwise.
>
> Using -C explains why you'd see a lot of benefit from sharing page tables for
> executable code. But I don't think -C is a particularly interesting workload
> to optimize for.
>
>
> > I'm no expert in this stuff, but it seems to me that with shared page
> > table schemes you can avoid wasting huge amounts of RAM on duplicated
> > page table entries (pages * processes), and with huge/super pages you
> > can reduce the number of pages, but AFAIK you still can't escape the
> > TLB shootdown cost, which is all-or-nothing (PCID level at best).
>
> Pretty much that. While you can avoid some TLB shootdowns via PCIDs, that only
> avoids flushing the TLB; it doesn't help with the TLB hit rate being much
> lower due to the number of "redundant" mappings with different PCIDs.
>
>
> > The only way to avoid TLB shootdowns on context switches is to have *exactly
> > the same memory map*. Or, as Robert succinctly shouted, "THREADS".
>
> +1
>
> Greetings,
>
> Andres Freund