Re: Better HINT message for "unexpected data beyond EOF"

From: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Christoph Berg <myon(at)debian(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Better HINT message for "unexpected data beyond EOF"
Date: 2025-04-04 10:55:14
Message-ID: CAKZiRmyZn1JX-joAKhMFQF83yqKsaMx0wNBYU+8ya8dperfkkg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 1, 2025 at 3:59 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

Hi Robert, Andres, Christoph,

> On 2025-04-01 09:49:12 -0400, Robert Haas wrote:
> > On Tue, Apr 1, 2025 at 7:13 AM Jakub Wartak
> > <jakub(dot)wartak(at)enterprisedb(dot)com> wrote:
> > > Thread bump. So we have the following candidates:
> > >
> > > 1. remove it as Andres stated:
> > > ERROR: unexpected data beyond EOF in block 1472 of relation base/5/16387
> > >
> > > 2a. Robert's idea
> > > ERROR: unexpected data beyond EOF in block 1472 of relation base/5/16387
> > > HINT: This has been observed with PostgreSQL files being overwritten.
> > >
> > > 2b. Christoph's idea
> > > ERROR: unexpected data beyond EOF in block 1472 of relation base/5/16387
> > > HINT: Did anything besides PostgreSQL touch that file?
>
> FWIW, I think these are all just about equally wrong.
> 1) doesn't allow the user to understand what could be the culprit

Well, that's pretty easy: tablespace relations were overwritten live
(PITR on the same host, w/o tablespace remapping). This assumes you
know that this restore is happening in the first place.

> 2*) omit that zero_damaged_pages can cause this due to the logic in mdreadv()

I saw commit 00066aa173 [1], but zero_damaged_pages is practically
never used outside of handling corruption cases, right?

> > > Another question is should we back-patch this? I believe we should (?)
> >
> > I don't think this qualifies as a bug. The current wording isn't
> > factually wrong, just unhelpful.

I think it is highly misleading and outdated, although it certainly
had value in the past.
I cannot speak for others, but in the past it has sent me into
literally cross-checking whether Linux's lseek() system call vector
had been replaced by some LKMs (in some cases it was...).
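
As a rough illustration of that kind of cross-check, here is a minimal
userspace sketch that compares the file size reported by lseek() with
the one reported by fstat() and derives a block count from it. The
helper name check_file_size and the 8192-byte BLCKSZ (the default build
setting) are assumptions made for the example; this is not taken from
the patch or from the original investigation, and a hooked kernel could
of course lie consistently through both calls.

```python
import os

BLCKSZ = 8192  # assumed: PostgreSQL's default block size


def size_via_lseek(path):
    """Return the file size as reported by lseek(fd, 0, SEEK_END)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)


def size_via_fstat(path):
    """Return the file size as reported by fstat(fd)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


def check_file_size(path):
    """Hypothetical helper: compare both size sources for one file."""
    by_lseek = size_via_lseek(path)
    by_fstat = size_via_fstat(path)
    return {
        "lseek": by_lseek,
        "fstat": by_fstat,
        "agree": by_lseek == by_fstat,
        # Number of whole blocks the file holds according to lseek().
        "nblocks": by_lseek // BLCKSZ,
    }
```

Running check_file_size() against a relation segment such as
base/5/16387 would show whether the two size sources agree and how many
blocks the file currently holds.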

So yes I agree, it would be *better* if it wasn't present in the first
place in modern times.

> > Even if it were wrong, we need a
> > pretty good reason to change message strings in a stable branch,
> > because that can break things for users who are grepping for the
> > current string (or a translation thereof). If an overwhelming
> > consensus in favor of back-patching emerges, fine, but my gut feeling
> > is that back-patching will make more people sad than it makes happy.
>
> I'd certainly not backpatch.

There goes my plan... I would recommend backpatching, but I'm alone
and outvoted by more experienced people :^)

OK, so attached is a small patch that removes this HINT. It has been
CI tested, verified using the original reproducer, and registered in
the cf app.

-J.

[1] - https://github.com/postgres/postgres/commit/00066aa1733d84109f7569a7202c3915d8289d3a

Attachment Content-Type Size
v1-0001-Remove-HINT-message-for-unexpected-data-beyond-EO.patch application/octet-stream 13.7 KB
