From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: BitmapHeapScan streaming read user and prelim refactoring |
Date: | 2025-02-13 14:35:44 |
Message-ID: | CAAKRu_ajp07hn+nerVQKxnM7cVAj+ZYzoGQdBiFLZtX9w6NgaA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Feb 13, 2025 at 7:08 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
>
>
> On 2/13/25 01:40, Melanie Plageman wrote:
> > On Sun, Feb 9, 2025 at 9:27 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> >>
> >> For the nvme RAID (device: raid-nvme), it's looks almost exactly the
> >> same, except that with parallel query (page 27) there's a clear area of
> >> regression with eic=1 (look for "column" of red cells). That's a bit
> >> unfortunate, because eic=1 is the default value.
> >
> > So, I feel pretty confident after even more analysis today with Thomas
> > Munro that all of the parallel cases with effective_io_concurrency ==
> > 1 are because master cheats effective_io_concurrency. Therefore, I'm
> > not too worried about those for now. We probably will have to increase
> > effective_io_concurrency before merging this patch, though.
> >
> > Thomas mentioned this to me off-list, and I think he's right. We
> > likely need to rethink the way parallel bitmap heap scan workers get
> > block assignments for reading and prefetching to make it more similar
> > to parallel sequential scan. The workers should probably get
> > assignments of a range of blocks. On master, each worker does end up
> > issuing reads/fadvises for a bunch of blocks in a row -- even though
> > that isn't the intent of the parallel bitmap table scan
> > implementation. We are losing some of that with the patch -- but only
> > because it is behaving as you would expect given the implementation
> > and design. I don't consider that a blocker, though.
> >
>
> Agreed. If this is due to master doing something wrong (and either
> prefetching more, or failing to prefetch), that's not the fault of this
> patch. People might perceive this as a regression, but increasing the
> default eic value would fix that.
>
> BTW I recall we ran into issues with prefetching and parallel workers
> about a year ago [1]. Is this the same issue, or something new?
>
> [1]
> https://www.postgresql.org/message-id/20240315211449.en2jcmdqxv5o6tlz%40alap3.anarazel.de
Th postfetching is the same issue as you linked in [1].
> >> For the SATA SSD RAID (device: raid-sata), it's similar - a couple
> >> regressions for eic=0, just like for NVMe. But then there's also a
> >> couple regressions for higher eic values, usually for queries with
> >> selective conditions.
> >
> > I started trying to reproduce the regressions you saw on your SATA
> > devices for higher eic values. I was focusing just on serial bitmap
> > heap scan -- since that doesn't have the problems mentioned above. I
> > don't have a SATA drive, but I used dmsetup to add a 10ms delay to my
> > nvme drive. Unfortunately, that did the opposite of reproduce the
> > issue. With dm delay 10ms, the patch is 5 times faster than master.
> >
>
> I'm not quite sure at which point does dm-setup add the delay. Does it
> affect the fadvise call too, or just the I/O if the page is not already
> in page cache?
So dmsetup adds the delay only when the page is not found in kernel
buffer cache -- actually below the filesystem before issuing it to the
block device. reads, writes, and flushes can be delayed (I delayed all
three by 10ms). fadvises can't since they are above that level.
- Melanie
From | Date | Subject | |
---|---|---|---|
Next Message | vignesh C | 2025-02-13 14:41:48 | Re: Restrict publishing of partitioned table with a foreign table as partition |
Previous Message | Ranier Vilela | 2025-02-13 14:28:20 | Simplify the logic a bit (src/bin/scripts/reindexdb.c) |