Re: index prefetching

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: index prefetching
Date: 2024-01-25 16:47:06
Message-ID: 033fd9c8-2efc-4777-bdc6-59c13ec4a64d@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/25/24 11:45, Dilip Kumar wrote:
> On Wed, Jan 24, 2024 at 11:43 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
>> On 1/22/24 07:35, Konstantin Knizhnik wrote:
>>>
>>> On 22/01/2024 1:47 am, Tomas Vondra wrote:
>>>> h, right. Well, you're right in this case we perhaps could set just one
>>>> of those flags, but the "purpose" of the two places is quite different.
>>>>
>>>> The "prefetch" flag is fully controlled by the prefetcher, and it's up
>>>> to it to change it (e.g. I can easily imagine some new logic touching
>>>> setting it to "false" for some reason).
>>>>
>>>> The "data" flag is fully controlled by the custom callbacks, so whatever
>>>> the callback stores, will be there.
>>>>
>>>> I don't think it's worth simplifying this. In particular, I don't think
>>>> the callback can assume it can rely on the "prefetch" flag.
>>>>
>>> Why not to add "all_visible" flag to IndexPrefetchEntry ? If will not
>>> cause any extra space overhead (because of alignment), but allows to
>>> avoid dynamic memory allocation (not sure if it is critical, but nice to
>>> avoid if possible).
>>>
>>
> While reading through the first patch I got some questions, I haven't
> read it complete yet but this is what I got so far.
>
> 1.
> +static bool
> +IndexPrefetchBlockIsSequential(IndexPrefetch *prefetch, BlockNumber block)
> +{
> + int idx;
> ...
> + if (prefetch->blockItems[idx] != (block - i))
> + return false;
> +
> + /* Don't prefetch if the block happens to be the same. */
> + if (prefetch->blockItems[idx] == block)
> + return false;
> + }
> +
> + /* not sequential, not recently prefetched */
> + return true;
> +}
>
> The above function name is BlockIsSequential but at the end, it
> returns true if it is not sequential, seem like a problem?

Actually, I think it's the comment that's wrong - the last return is
reached only for a sequential pattern (and when the block was not
accessed recently).

> Also other 2 checks right above the end of the function are returning
> false if the block is the same or the pattern is sequential I think
> those are wrong too.
>

Hmmm. You're right this is partially wrong. There are two checks:

/*
* For a sequential pattern, blocks "k" step ago needs to have block
* number by "k" smaller compared to the current block.
*/
if (prefetch->blockItems[idx] != (block - i))
return false;

/* Don't prefetch if the block happens to be the same. */
if (prefetch->blockItems[idx] == block)
return false;

The first condition is correct - we want to return "false" when the
pattern is not sequential.

But the second condition is wrong - we want to skip prefetching when the
block was already prefetched recently, so this should return true (which
is a bit misleading, as it seems to imply the pattern is sequential,
when it's not).

However, this is harmless, because we then identify this block as
recently prefetched in the "full" cache check, so we won't prefetch it
anyway. So it's harmless, although a bit more expensive.

There's another inefficiency - we stop looking for the same block once
we find the first block breaking the non-sequential pattern. Imagine a
sequence of blocks 1, 2, 3, 1, 2, 3, ... in which case we never notice
the block was recently prefetched, because we always find the break of
the sequential pattern. But again, it's harmless, thanks to the full
cache of recently prefetched blocks.

> 2.
> I have noticed that the prefetch history is maintained at the backend
> level, but what if multiple backends are trying to fetch the same heap
> blocks maybe scanning the same index, so should that be in some shared
> structure? I haven't thought much deeper about this from the
> implementation POV, but should we think about it, or it doesn't
> matter?

Yes, the cache is at the backend level - it's a known limitation, but I
see it more as a conscious tradeoff.

Firstly, while the LRU cache is at backend level, PrefetchBuffer also
checks shared buffers for each prefetch request. So with sufficiently
large shared buffers we're likely to find it there (and for direct I/O
there won't be any other place to check).

Secondly, the only other place to check is page cache, but there's no
good (sufficiently cheap) way to check that. See the preadv2/nowait
experiment earlier in this thread.

I suppose we could implement a similar LRU cache for shared memory (and
I don't think it'd be very complicated), but I did not plan to do that
in this patch unless absolutely necessary.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Sergey Prokhorenko 2024-01-25 17:04:05 Re: UUID v7
Previous Message torikoshia 2024-01-25 16:42:14 Add new error_action COPY ON_ERROR "log"