From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
---|---|
To: | Jim Nasby <jim(dot)nasby(at)gmail(dot)com> |
Cc: | Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: Confine vacuum skip logic to lazy_scan_skip |
Date: | 2024-01-12 19:09:23 |
Message-ID: | CAAKRu_aWdt8_tfsrLASd3sr2NP8-MkBBaM_XtegijyDc0R5dRA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jan 12, 2024 at 2:02 PM Jim Nasby <jim(dot)nasby(at)gmail(dot)com> wrote:
>
> On 1/11/24 5:50 PM, Melanie Plageman wrote:
> > On Fri, Jan 5, 2024 at 5:51 AM Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
> >>
> >> On Fri, 5 Jan 2024 at 02:25, Jim Nasby <jim(dot)nasby(at)gmail(dot)com> wrote:
> >>>
> >>> On 1/4/24 2:23 PM, Andres Freund wrote:
> >>>
> >>> On 2024-01-02 12:36:18 -0500, Melanie Plageman wrote:
> >>>
> >>> Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var rel_pages
> >>> Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
> >>> nskippable_blocks
> >>>
> >>> I think these may lead to worse code - the compiler has to reload
> >>> vacrel->rel_pages/next_unskippable_block for every loop iteration, because it
> >>> can't guarantee that they're not changed within one of the external functions
> >>> called in the loop body.
> >>>
> >>> Admittedly I'm not up to speed on recent vacuum changes, but I have to wonder if the concept of skipping should go away in the context of vector IO? Instead of thinking about "we can skip this range of blocks", why not maintain a list of "here's the next X number of blocks that we need to vacuum"?
> >>
> >> Sorry if I misunderstood. AFAIU, with the help of the vectored IO;
> >> "the next X number of blocks that need to be vacuumed" will be
> >> prefetched by calculating the unskippable blocks ( using the
> >> lazy_scan_skip() function ) and the X will be determined by Postgres
> >> itself. Do you have something different in your mind?
> >
> > I think you are both right. As we gain more control of readahead from
> > within Postgres, we will likely want to revisit this heuristic as it
> > may not serve us anymore. But the streaming read interface/vectored
> > I/O is also not a drop-in replacement for it. To change anything and
> > ensure there is no regression, we will probably have to do
> > cross-platform benchmarking, though.
> >
> > That being said, I would absolutely love to get rid of the skippable
> > ranges because I find them very error-prone and confusing. Hopefully
> > now that the skipping logic is isolated to a single function, it will
> > be easier not to trip over it when working on lazy_scan_heap().
>
> Yeah, arguably it's just a matter of semantics, but IMO it's a lot
> clearer to simply think in terms of "here's the next blocks we know we
> want to vacuum" instead of "we vacuum everything, but sometimes we skip
> some blocks".
Even "we vacuum some stuff, but sometimes we skip some blocks" would
be okay. What we have now is "we vacuum some stuff, but sometimes we
skip some blocks, but only if we would skip enough blocks, and, when
we decide to do that we can't go back and actually get visibility
information for those blocks we skipped because we are too cheap"
- Melanie
From | Date | Subject | |
---|---|---|---|
Next Message | Konstantin Knizhnik | 2024-01-12 19:30:21 | Re: Custom explain options |
Previous Message | Pavel Stehule | 2024-01-12 19:09:14 | Re: plpgsql memory leaks |