From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2024-03-01 17:08:03
Message-ID: 410c42e0-f17b-447f-aa48-fba052d5c080@enterprisedb.com
Lists: pgsql-hackers
On 3/1/24 17:51, Melanie Plageman wrote:
> On Fri, Mar 1, 2024 at 9:05 AM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> On 3/1/24 02:18, Melanie Plageman wrote:
>>> On Thu, Feb 29, 2024 at 6:44 PM Tomas Vondra
>>> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>>>
>>>> On 2/29/24 23:44, Tomas Vondra wrote:
>>>> 1) On master there's a clear difference between the eic=0 and eic=1
>>>> cases, but on the patched build there's literally no difference - for
>>>> example the "uniform" distribution is clearly not great for prefetching,
>>>> but eic=0 regresses to the poor eic=1 behavior.
>>>
>>> Yes, so eic=0 and eic=1 are identical with the streaming read API.
>>> That is, eic 0 does not disable prefetching. Thomas is going to update
>>> the streaming read API to avoid issuing an fadvise for the last block
>>> in a range before issuing a read -- which would mean no prefetching
>>> with eic 0 and eic 1. Not doing prefetching with eic 1 actually seems
>>> like the right behavior -- which would be different than what master
>>> is doing, right?
>>
>> I don't think we should stop doing prefetching for eic=1, or at least
>> not based just on these charts. I suspect these "uniform" charts are not
>> a great example for prefetching, because the selectivity is defined on
>> individual rows, and even a small fraction of rows may match most of the
>> pages. It's great for finding strange behaviors / corner cases, but
>> probably not a sufficient reason to change the default.
>
> Yes, I would like to see results from a data set where selectivity is
> more correlated to pages/heap fetches. But, I'm not sure I see how
> that is related to prefetching when eic = 1.
>
OK, I'll make that happen.
>> I think it makes sense to issue a prefetch one page ahead, before
>> reading/processing the current one. It's a fairly conservative setting,
>> and I assume the default was chosen for a reason / after discussion.
>
> Yes, I suppose the overhead of an fadvise does not compare to the IO
> latency of synchronously reading that block. Actually, I bet the
> regression I saw by accidentally moving BitmapAdjustPrefetchIterator()
> after table_scan_bitmap_next_block() would be similar to the
> regression introduced by making eic = 1 not prefetch.
>
> When you think about IO concurrency = 1, it doesn't imply prefetching
> to me. But, I think we want to do the right thing and have parity with
> master.
>
Just to be sure we're on the same page regarding what eic=1 means,
consider a simple sequence of pages: A, B, C, D, E, ...
With the current "master" code, eic=1 means we'll issue a prefetch for B
and then read+process A. And then issue prefetch for C and read+process
B, and so on. It's always one page ahead.
Yes, if the page is already in memory, the fadvise is just overhead. That
may happen for various reasons (say, read-ahead). But that's just this one
case; I'd bet in other cases eic=1 is a win.
>> My suggestion would be to keep the master behavior unless that's not
>> practical, and maybe discuss changing the details later. The patch is
>> already complicated enough; better to leave that discussion for later.
>
> Agreed. Speaking of which, we need to add back use of tablespace IO
> concurrency for the streaming read API (which is used by
> BitmapHeapScan in master).
>
+1
>>> With very low selectivity, you are less likely to get readahead
>>> (right?) and similarly less likely to be able to build up > 8kB IOs --
>>> which is one of the main value propositions of the streaming read
>>> code. I imagine that this larger read benefit is part of why the
>>> performance is better at higher selectivities with the patch. This
>>> might be a silly experiment, but we could try decreasing
>>> MAX_BUFFERS_PER_TRANSFER on the patched version and see if the
>>> performance gains go away.
>>
>> Sure, I can do that. Do you have any particular suggestion what value to
>> use for MAX_BUFFERS_PER_TRANSFER?
>
> I think setting it to 1 would be the same as master -- doing only 8kB
> reads. The only caveat is that I imagine the rest of the streaming read
> code has some overhead, which might end up being a regression on
> balance, even with the prefetching, if we aren't actually using the
> ranges/vectored capabilities of the streaming read interface. Maybe
> just run it for one of the cases with a very obvious performance
> improvement? I can also try this locally.
>
OK, I'll try with 1, and then we can adjust.
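For reference, this is roughly how I think about what that limit controls,
as a simplified sketch (not the actual streaming read implementation;
read_coalesced and max_buffers are made-up names):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLCKSZ 8192

/*
 * Issue reads for the sorted block numbers blkno[0..nblocks-1], merging
 * runs of consecutive blocks into a single larger read of at most
 * max_buffers blocks.  With max_buffers = 1 every read degenerates into
 * a single 8kB pread, i.e. the same I/O pattern as master.
 */
static void
read_coalesced(int fd, const off_t *blkno, int nblocks, int max_buffers)
{
    char       *buf = malloc((size_t) max_buffers * BLCKSZ);
    int         i = 0;

    if (buf == NULL)
    {
        perror("malloc");
        exit(1);
    }

    while (i < nblocks)
    {
        int         run = 1;

        /* extend the run while blocks stay consecutive and we have room */
        while (i + run < nblocks &&
               blkno[i + run] == blkno[i] + run &&
               run < max_buffers)
            run++;

        if (pread(fd, buf, (size_t) run * BLCKSZ, blkno[i] * BLCKSZ) < 0)
        {
            perror("pread");
            exit(1);
        }

        /* ... hand the run of blocks to the scan ... */
        i += run;
    }

    free(buf);
}

Lowering max_buffers limits how much of the "fewer, larger reads" benefit
survives, so this should tell us how much of the improvement comes from the
larger reads vs. the prefetching.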
>> I'll also try to add a better version of "uniform", where the selectivity
>> more closely tracks pages, not rows.
>
> This would be great.
>
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company