From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: BAS_BULKREAD vs read stream |
Date: | 2025-04-08 02:20:38 |
Message-ID: | hdcz65oehetyjygraktst2xb77qnyiig5mxhi46yhvfmzdgnqo@tarjv3ofqoe4 |
Lists: | pgsql-hackers |
Hi,
On 2025-04-07 16:28:20 -0400, Andres Freund wrote:
> On 2025-04-07 15:24:43 -0400, Melanie Plageman wrote:
> > On Sun, Apr 6, 2025 at 4:15 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > >
> > > I think we should consider increasing BAS_BULKREAD to something like
> > > Min(256, io_combine_limit * (effective_io_concurrency + 1))
> >
> > Do you mean Max? If so, this basically makes sense to me.
>
> Err, yes.
In the attached I implemented the above idea, with some small additional
refinements:
- To allow sync seqscans to work at all, we should only *add* to the 256kB
that we currently have - otherwise all buffers in a ring will be undergoing
IO, never allowing two synchronizing scans to actually use the buffers that
the other scan has already read in.
This also kind of obsoletes the + 1 in the formula above, although that is
arguable, particularly for effective_io_concurrency=0.
- If the backend has a PinLimit() that won't allow io_combine_limit *
effective_io_concurrency buffers to undergo IO, it doesn't make sense to
make the ring bigger. At best it would waste space for the ring, at worst
it'd make "ring escapes" inevitable - victim buffer search would always
replace buffers that we have in the ring.
- the multiplication by (BLCKSZ / 1024) that I omitted above is actually
included :)
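To make the arithmetic in the list above concrete, here is a minimal sketch in C. It is not the code from the attached patch: the function name, its parameters, and the choice to cap the extra buffers at the pin limit (one possible reading of the PinLimit() point) are assumptions made purely for illustration, as is the 8kB BLCKSZ.

```c
/* Assumed block size for the example; PostgreSQL builds may differ. */
#define BLCKSZ 8192

/* Base ring size in kB that the email says should only be added to. */
#define BULKREAD_BASE_KB 256

/*
 * Hypothetical helper (not from the patch): compute a BAS_BULKREAD ring
 * size in kB.
 *
 * io_combine_limit and effective_io_concurrency are taken in blocks and
 * IOs respectively. pin_limit is a stand-in for whatever the backend's
 * pin limit allows, expressed in buffers.
 */
static int
bulkread_ring_size_kb(int io_combine_limit,
                      int effective_io_concurrency,
                      int pin_limit)
{
    /* Buffers that could be undergoing IO at the same time. */
    int io_buffers = io_combine_limit * effective_io_concurrency;

    /* Assumption: buffers beyond the pin limit cannot be used for IO. */
    if (io_buffers > pin_limit)
        io_buffers = pin_limit;

    /* Add to the existing 256kB instead of replacing it. */
    return BULKREAD_BASE_KB + io_buffers * (BLCKSZ / 1024);
}
```

With example values of 16 for both settings and a generous pin limit, this yields 256 + 16 * 16 * 8 = 2304kB. With effective_io_concurrency=0 the ring stays at 256kB, which is why the "+ 1" becomes arguable once the base is additive.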
I unfortunately think we do need *something* to address $subject for 18 - the
performance regression when increasing relation sizes is otherwise just too
big - it's trivial to find queries getting slower by more than 4x. And that is
on local, low-latency NVMe storage - on network storage the regression will
often be bigger.
If we don't do something for 18, the only consolation would be that the
performance when using the 256kB BAS_BULKREAD is rather close to the
performance one gets in 17, with or without a strategy. But I don't
think that would make it less surprising that once your table grows
sufficiently large to use a strategy, your IO throughput craters.
I have a local prototype for the 17/18 "strategy escape" issue and will work on
polishing that soon, unless you have something for that, Thomas?
Greetings,
Andres Freund
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Increase-BAS_BULKREAD-based-on-effective_io_concu.patch | text/x-diff | 3.3 KB |