From: | wenhui qiu <qiuwenhuifx(at)gmail(dot)com> |
---|---|
To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Cc: | Nathan Bossart <nathandbossart(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>, David Rowley <dgrowley(at)gmail(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
Subject: | Re: Trigger more frequent autovacuums of heavy insert tables |
Date: | 2025-02-25 03:30:04 |
Message-ID: | CAGjGUAKZaxCVbsmoReoYcKv7pu3wtuGpanyX-hE69v1AO_K0_Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Melanie
> relallvisible. It seems like we should make it consistent. Perhaps we
> should just remove it from heap_vacuum_rel(). Then add an assert in
> all these places to at least protect development mistakes.
I think there's some objection to that.
Thanks
On Tue, Feb 25, 2025 at 7:35 AM Melanie Plageman <melanieplageman(at)gmail(dot)com>
wrote:
> On Mon, Feb 24, 2025 at 4:53 PM Nathan Bossart <nathandbossart(at)gmail(dot)com>
> wrote:
> >
> > On Thu, Feb 20, 2025 at 07:35:32PM -0500, Melanie Plageman wrote:
> > > Attache v7 doesn't cap the result for manual stats updating done with
> > > relation_statistics_update().
> >
> > Unfortunately, this already needs a rebase...
>
> Thanks! Attached v8 does just that.
>
> I've also taken a pass at updating the stats import tests to include
> relallfrozen. I'm not totally sure how I feel about the results. I
> erred on the side of putting relallfrozen wherever relallvisible was.
> But, that means relallfrozen is included in test cases where it
> doesn't seem important to the test case. But, that was true with
> relallvisible too.
>
> Anyway, I talked to Corey off-list last week. Maybe he will have a
> chance to weigh in on the test cases. Also, I had forgotten to include
> relallfrozen in pg_clear_relation_stats(). I've fixed that in v8, but
> now I'm worried there are some other places I may have missed.
>
> > > I did, however, keep the cap for the
> > > places where vacuum/analyze/create index update the stats. There the
> > > number for relallfrozen is coming directly from visibilitymap_count(),
> > > so it should be correct. I could perhaps add an assert instead, but I
> > > didn't think that really made sense. An assert is meant to help the
> > > developer and what could the developer do about the visibility map
> > > being corrupted.
> >
> > This might just be personal preference, but I think this is exactly the
> > sort of thing an assertion is meant for. If we expect the value to
> always
> > be correct, and it's not, then our assumptions were wrong or someone has
> > broken something. In both of these cases, I as a developer would really
> > like to know so that I don't introduce a latent bug. If we want Postgres
> > to gracefully handle or detect visibility map corruption, then maybe we
> > should do both or PANIC.
>
> I'm on the fence about adding a PANIC. We do PANIC in other places
> where we notice corruption (like PageAddItemExtended()). But, in most
> of the cases, it seems like we are PANICing because there isn't a
> reasonable way to accomplish the intended task. In this case, we
> probably can't trust the visibility map counts for that page, but the
> pg_class columns are just estimates, so just capping relallfrozen
> might be good enough.
>
> I will note that in the other place where we may notice corruption in
> the VM, in lazy_scan_prune(), we do visibilitymap_clear() and print a
> WARNING -- as opposed to PANICing. Perhaps that's because there is no
> need to panic, since we are already fixing the problem with
> visibiliytmap_clear().
>
> An assert would only help if the developer did something while
> developing that corrupted the visibility map. It doesn't help keep
> bogus values out of pg_class if a user's visibility map got corrupted
> in some way. But, maybe that isn't needed.
>
> > I do see that heap_vacuum_rel() already caps relallvisible to relpages,
> but
> > it's not clear to me whether that's something that we expect to regularly
> > happen in practice or what the adverse effects might be. So perhaps I'm
> > misunderstanding the scope and severity of bogus results from
> > visibilitymap_count(). Commit e6858e6, which added this code, doesn't
> say
> > anything about safety, but commit 3d351d9 changed the comment in question
> > to its current wording. After a very quick skim of the latter's thread
> > [0], I don't see any discussion about this point, either.
>
> Thanks for doing this archaeology.
>
> I am hesitant to keep the current cap on relallvisible in
> heap_vacuum_rel() but then not include an equivalent cap for
> relallfrozen. And I think it would be even more confusing for
> relallvisible to be capped but relallfrozen has an assert instead.
>
> None of the other locations where relallvisible is updated
> (do_analyze_rel(), index_update_stats()) do this capping of
> relallvisible. It seems like we should make it consistent. Perhaps we
> should just remove it from heap_vacuum_rel(). Then add an assert in
> all these places to at least protect development mistakes.
>
> - Melanie
>
From | Date | Subject | |
---|---|---|---|
Next Message | jian he | 2025-02-25 03:30:12 | Re: Statistics Import and Export |
Previous Message | Jeff Davis | 2025-02-25 02:54:25 | Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators |