Re: BitmapHeapScan streaming read user and prelim refactoring

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2025-02-13 12:08:28
Message-ID: eacfc686-bac4-4bdf-b49d-4df76e61016f@vondra.me
Lists: pgsql-hackers

On 2/13/25 01:40, Melanie Plageman wrote:
> On Sun, Feb 9, 2025 at 9:27 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>>
>> For the nvme RAID (device: raid-nvme), it looks almost exactly the
>> same, except that with parallel query (page 27) there's a clear area of
>> regression with eic=1 (look for "column" of red cells). That's a bit
>> unfortunate, because eic=1 is the default value.
>
> So, I feel pretty confident after even more analysis today with Thomas
> Munro that all of the parallel cases with effective_io_concurrency ==
> 1 are because master cheats effective_io_concurrency. Therefore, I'm
> not too worried about those for now. We probably will have to increase
> effective_io_concurrency before merging this patch, though.
>
> Thomas mentioned this to me off-list, and I think he's right. We
> likely need to rethink the way parallel bitmap heap scan workers get
> block assignments for reading and prefetching to make it more similar
> to parallel sequential scan. The workers should probably get
> assignments of a range of blocks. On master, each worker does end up
> issuing reads/fadvises for a bunch of blocks in a row -- even though
> that isn't the intent of the parallel bitmap table scan
> implementation. We are losing some of that with the patch -- but only
> because it is behaving as you would expect given the implementation
> and design. I don't consider that a blocker, though.
>

Agreed. If this is due to master doing something wrong (and either
prefetching more, or failing to prefetch), that's not the fault of this
patch. People might perceive this as a regression, but increasing the
default eic value would fix that.
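For reference, raising effective_io_concurrency is a one-line configuration change. A hedged sketch (the value 16 is illustrative; the thread only says the default of 1 would likely need to be increased before merging):

```shell
# Raise effective_io_concurrency for the current session only --
# the GUC is user-settable, so no restart is needed.
psql -d postgres -c "SET effective_io_concurrency = 16;"

# Or persist it cluster-wide via ALTER SYSTEM and reload:
psql -d postgres -c "ALTER SYSTEM SET effective_io_concurrency = 16;"
psql -d postgres -c "SELECT pg_reload_conf();"
```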

BTW I recall we ran into issues with prefetching and parallel workers
about a year ago [1]. Is this the same issue, or something new?

[1]
https://www.postgresql.org/message-id/20240315211449.en2jcmdqxv5o6tlz%40alap3.anarazel.de

>> For the SATA SSD RAID (device: raid-sata), it's similar - a couple
>> regressions for eic=0, just like for NVMe. But then there's also a
>> couple regressions for higher eic values, usually for queries with
>> selective conditions.
>
> I started trying to reproduce the regressions you saw on your SATA
> devices for higher eic values. I was focusing just on serial bitmap
> heap scan -- since that doesn't have the problems mentioned above. I
> don't have a SATA drive, but I used dmsetup to add a 10ms delay to my
> nvme drive. Unfortunately, that did the opposite of reproducing the
> issue. With a dm delay of 10ms, the patch is 5 times faster than master.
>

I'm not quite sure at which point dmsetup adds the delay. Does it
affect the fadvise call too, or just the I/O when the page is not
already in the page cache?
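For what it's worth, dm-delay inserts its delay at the block layer, when a bio is submitted to the underlying device. The fadvise() syscall itself returns immediately; only the readahead I/O it kicks off (and any read that misses the page cache) is delayed. A hedged sketch of the kind of setup Melanie may have used (the 10ms figure is from her mail; the device name and everything else here are assumptions):

```shell
# Create a dm-delay target that adds 10ms to every I/O on the
# underlying NVMe partition (run as root; device name is hypothetical).
dev=/dev/nvme0n1p2
size=$(blockdev --getsz "$dev")   # device size in 512-byte sectors

# dm-delay table format: <start> <len> delay <device> <offset> <delay_ms>
echo "0 $size delay $dev 0 10" | dmsetup create delayed

# The delayed device is now /dev/mapper/delayed; a filesystem on it
# sees +10ms on any read that actually reaches the device. Pages
# already in the OS page cache are served without delay, and
# posix_fadvise(POSIX_FADV_WILLNEED) returns without blocking --
# the delay applies only when the resulting readahead bio is issued.

# Tear down when done:
# dmsetup remove delayed
```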

regards

--
Tomas Vondra
