From: | Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Christoph Berg <myon(at)debian(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Better HINT message for "unexpected data beyond EOF" |
Date: | 2025-04-04 10:55:14 |
Message-ID: | CAKZiRmyZn1JX-joAKhMFQF83yqKsaMx0wNBYU+8ya8dperfkkg@mail.gmail.com |
Lists: | pgsql-hackers |
Hi Robert, Andres, Christoph,
On Tue, Apr 1, 2025 at 3:59 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2025-04-01 09:49:12 -0400, Robert Haas wrote:
> > On Tue, Apr 1, 2025 at 7:13 AM Jakub Wartak
> > <jakub(dot)wartak(at)enterprisedb(dot)com> wrote:
> > > Thread bump. So we have the following candidates:
> > >
> > > 1. remove it as Andres stated:
> > > ERROR: unexpected data beyond EOF in block 1472 of relation base/5/16387
> > >
> > > 2a. Robert's idea
> > > ERROR: unexpected data beyond EOF in block 1472 of relation base/5/16387
> > > HINT: This has been observed with PostgreSQL files being overwritten.
> > >
> > > 2b. Christoph's idea
> > > ERROR: unexpected data beyond EOF in block 1472 of relation base/5/16387
> > > HINT: Did anything besides PostgreSQL touch that file?
>
> FWIW, I think these are all just about equally wrong.
> 1) doesn't allow the user to understand what could be the culprit
Well, that's pretty easy: tablespace relations were overwritten live
(PITR on the same host, without tablespace remapping). This assumes you
know that such a restore is happening in the first place.
> 2*) omit that zero_damaged_pages can cause this due to the logic in mdreadv()
I saw 00066aa173 [1], but zero_damaged_pages is practically never used
(outside of handling corruption cases), right?
> > > Another question is should we back-patch this? I believe we should (?)
> >
> > I don't think this qualifies as a bug. The current wording isn't
> > factually wrong, just unhelpful.
I think it is highly misleading and out of date, although it
certainly had value in the past.
I cannot speak for others, but in the past it has literally sent me
cross-checking whether Linux's lseek() system call vector had been
replaced by some LKM (in some cases it had...).
So yes, I agree: it would be *better* if it weren't present in the first
place in modern times.
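For illustration only (this is not part of the attached patch): a minimal
sketch of the simplest form of that cross-check, comparing what lseek()
reports for a relation file against the block number from the error.
The helper names are hypothetical, the 8192-byte block size is an assumed
default, and segment files of large relations are ignored.

```python
import os

BLCKSZ = 8192  # assumed default PostgreSQL block size; custom builds can differ


def blocks_by_lseek(path: str, blcksz: int = BLCKSZ) -> int:
    """Return how many whole blocks the file holds according to lseek(SEEK_END)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # lseek() to the end returns the file size in bytes
        size = os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)
    return size // blcksz


def block_is_beyond_eof(path: str, blkno: int, blcksz: int = BLCKSZ) -> bool:
    """True if 0-based block `blkno` lies at or past the EOF that lseek() reports."""
    return blkno >= blocks_by_lseek(path, blcksz)
```

With the error quoted earlier, one would call block_is_beyond_eof() on the
file base/5/16387 under the data directory with blkno=1472. If the file
was overwritten by a restore in the meantime, the answer can differ from
what the backend saw when it raised the error.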
> > Even if it were wrong, we need a
> > pretty good reason to change message strings in a stable branch,
> > because that can break things for users who are grepping for the
> > current string (or a translation thereof). If an overwhelming
> > consensus in favor of back-patching emerges, fine, but my gut feeling
> > is that back-patching will make more people sad than it makes happy.
>
> I'd certainly not backpatch.
There goes my plan... I would recommend backpatching, but I'm alone
and outvoted by more experienced people :^)
OK, so attached is a small patch to eradicate this HINT. It is CI-tested,
verified using the original reproducer, and registered in the CF app.
-J.
[1] - https://github.com/postgres/postgres/commit/00066aa1733d84109f7569a7202c3915d8289d3a
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Remove-HINT-message-for-unexpected-data-beyond-EO.patch | application/octet-stream | 13.7 KB |