From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Fail hard if xlogreader.c fails on out-of-memory |
Date: | 2023-09-27 01:28:30 |
Message-ID: | 20230927012830.GB364510@rfd.leadboat.com |
Lists: | pgsql-hackers |
On Wed, Sep 27, 2023 at 11:06:37AM +1300, Thomas Munro wrote:
> On Tue, Sep 26, 2023 at 8:38 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> > Thoughts and/or comments are welcome.
>
> I don't have an opinion yet on your other thread about making this
> stuff configurable for replicas, but for the simple crash recovery
> case shown here, hard failure makes sense to me.
> Recycled pages can't fool us into making a huge allocation any more.
> If xl_tot_len implies more than one page but the next page's
> xlp_pageaddr is too low, then either the xl_tot_len you read was
> recycled garbage bits, or it was legitimate but the overwrite of the
> following page didn't make it to disk; either way, we don't have a
> record, so we have an end-of-wal condition. The xlp_rem_len check
> defends against the second page making it to disk while the first one
> still contains recycled garbage where the xl_tot_len should be*.
>
> What Michael wants to do now is remove the 2004-era assumption that
> malloc failure implies bogus data. It must be pretty unlikely in a 64
> bit world with overcommitted virtual memory, but a legitimate
> xl_tot_len can falsely end recovery and lose data, as reported from a
> production case analysed by his colleagues. In other words, we can
> actually distinguish between lack of resources and recycled bogus
> data, so why treat them the same?
Indeed. Hard failure is fine, and ENOMEM=end-of-WAL definitely isn't.
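To make the distinction concrete, here is a minimal sketch of the three outcomes discussed above. It is not the xlogreader.c code: the struct, the enum, the function names, and the page-address values are hypothetical simplifications, and the real checks involve more fields (CRC, xl_prev, and so on). It only illustrates that a cross-page record is rejected as end-of-WAL when the next page's address or remaining-length field does not match, while an allocation failure on an otherwise plausible length is reported separately so the caller can fail hard.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical outcome codes for this sketch; not PostgreSQL's own. */
typedef enum
{
    READ_OK,
    READ_END_OF_WAL,        /* no valid record here: stop replay normally */
    READ_OUT_OF_MEMORY      /* record looks valid, resources are lacking */
} ReadResult;

/* Simplified stand-in for the two page header fields named in the thread. */
typedef struct
{
    uint64_t pageaddr;      /* WAL address the page claims to hold */
    uint32_t rem_len;       /* bytes of a continued record still to come */
} SketchPageHeader;

/* Allocator hook so that a caller can simulate malloc failure. */
typedef void *(*alloc_fn) (size_t);

static void *
failing_alloc(size_t n)
{
    (void) n;
    return NULL;
}

static ReadResult
classify_record(uint32_t tot_len, uint32_t bytes_on_first_page,
                uint64_t expected_next_pageaddr,
                const SketchPageHeader *next,
                alloc_fn alloc, void **buf)
{
    assert(next != NULL && alloc != NULL && buf != NULL);
    *buf = NULL;

    /* Assumption of this sketch: a zero length means nothing was written. */
    if (tot_len == 0)
        return READ_END_OF_WAL;

    if (tot_len > bytes_on_first_page)
    {
        /*
         * The record continues on the next page.  A recycled page carries
         * an older (lower) address; the sketch rejects any mismatch.
         */
        if (next->pageaddr != expected_next_pageaddr)
            return READ_END_OF_WAL;

        /* Guards against a garbage tot_len in front of a valid next page. */
        if (next->rem_len != tot_len - bytes_on_first_page)
            return READ_END_OF_WAL;
    }

    /* Only now is the length trusted enough to allocate for it. */
    *buf = alloc(tot_len);
    if (*buf == NULL)
        return READ_OUT_OF_MEMORY;

    return READ_OK;
}
```

A caller would treat READ_END_OF_WAL as the normal end of replay and READ_OUT_OF_MEMORY as a reason to stop with an error, never as a signal that the WAL has ended.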
> *A more detailed analysis would talk about sectors (page header is
> atomic)
I think the page header is atomic on POSIX-compliant filesystems but not
atomic on ext4. That doesn't change the conclusion on $SUBJECT.