Re: AIO v2.5

From: Noah Misch <noah(at)leadboat(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Antonin Houska <ah(at)cybertec(dot)at>
Subject: Re: AIO v2.5
Date: 2025-03-29 13:41:43
Message-ID: 20250329134143.ca.nmisch@google.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 28, 2025 at 11:35:23PM -0400, Andres Freund wrote:
> The number of combinations is annoyingly large. It's e.g. plausible to use
> ignore_checksum_failure=on and zero_damaged_pages=on at the same time for
> recovery.

That's intricate indeed.

> But I finally got to a point where the code ends up readable, without undue
> duplication. It would, leaving some nasty hack aside, require a
> errhint_internal() - but I can't imagine a reason against introducing that,
> given we have it for the errmsg and errhint.

Introducing that is fine.

> Here's the relevant code:
>
> /*
> * Treat a read that had both zeroed buffers *and* ignored checksums as a
> * special case, it's too irregular to be emitted the same way as the other
> * cases.
> */
> if (zeroed_any && ignored_any)
> {
> Assert(zeroed_any && ignored_any);
> Assert(nblocks > 1); /* same block can't be both zeroed and ignored */
> Assert(result.status != PGAIO_RS_ERROR);
> affected_count = zeroed_or_error_count;
>
> ereport(elevel,
> errcode(ERRCODE_DATA_CORRUPTED),
> errmsg("zeroing %u pages and ignoring %u checksum failures among blocks %u..%u of relation %s",
> affected_count, checkfail_count, first, last, rpath.str),

Translation stumbles on this one, because each of the first two %u are
plural-sensitive. I'd do one of:

- Call ereport() twice, once for zeroed pages and once for ignored checksums.
Since elevel <= ERROR here, that doesn't lose the second call.

- s/pages/page(s)/ like msgid "There are %d other session(s) and %d prepared
transaction(s) using the database."

- Something more like the style of VACUUM VERBOSE, e.g. "INTRO_TEXT: %u
zeroed, %u checksums ignored". I've not written INTRO_TEXT, and this
doesn't really resolve pluralization. Probably don't use this option.

> affected_count > 1 ?
> errdetail("Block %u held first zeroed page.",
> first + first_off) : 0,
> errhint("See server log for details about the other %u invalid blocks.",
> affected_count + checkfail_count - 1));
> return;
> }
>
> /*
> * The other messages are highly repetitive. To avoid duplicating a long
> * and complicated ereport(), gather the translated format strings
> * separately and then do one common ereport.
> */
> if (result.status == PGAIO_RS_ERROR)
> {
> Assert(!zeroed_any); /* can't have invalid pages when zeroing them */
> affected_count = zeroed_or_error_count;
> msg_one = _("invalid page in block %u of relation %s");
> msg_mult = _("%u invalid pages among blocks %u..%u of relation %s");
> det_mult = _("Block %u held first invalid page.");
> hint_mult = _("See server log for the other %u invalid blocks.");

For each hint_mult, we would usually use ngettext() instead of _(). (Would be
errhint_plural() if not separated from its ereport().) Alternatively,
s/blocks/block(s)/ is fine.

> }
> else if (zeroed_any && !ignored_any)
> {
> affected_count = zeroed_or_error_count;
> msg_one = _("invalid page in block %u of relation %s; zeroing out page");
> msg_mult = _("zeroing out %u invalid pages among blocks %u..%u of relation %s");
> det_mult = _("Block %u held first zeroed page.");
> hint_mult = _("See server log for the other %u zeroed blocks.");
> }
> else if (!zeroed_any && ignored_any)
> {
> affected_count = checkfail_count;
> msg_one = _("ignoring checksum failure in block %u of relation %s");
> msg_mult = _("ignoring %u checksum failures among blocks %u..%u of relation %s");
> det_mult = _("Block %u held first ignored page.");
> hint_mult = _("See server log for the other %u ignored blocks.");
> }
> else
> pg_unreachable();
>
> ereport(elevel,
> errcode(ERRCODE_DATA_CORRUPTED),
> affected_count == 1 ?
> errmsg_internal(msg_one, first + first_off, rpath.str) :
> errmsg_internal(msg_mult, affected_count, first, last, rpath.str),
> affected_count > 1 ? errdetail_internal(det_mult, first + first_off) : 0,
> affected_count > 1 ? errhint_internal(hint_mult, affected_count - 1) : 0);
>
> Does that approach make sense?

Yes.

> What do you think about using
> "zeroing invalid page in block %u of relation %s"
> instead of
> "invalid page in block %u of relation %s; zeroing out page"

I like the replacement. It moves the important part to the front, and it's
shorter.

> I thought about instead translating "ignoring", "ignored", "zeroing",
> "zeroed", etc separately, but I have doubts about how well that would actually
> translate.

Agreed, I wouldn't have high hopes for that. An approach like that would
probably need messages that separate the independently-translated part
grammatically, e.g.:

/* last %s is translation of "ignore" or "zero-fill" */
"invalid page in block %u of relation %s; resolved by method \"%s\""

(Again, I'm not recommending that.)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2025-03-29 14:08:06 Re: Thread-safe nl_langinfo() and localeconv()
Previous Message Kirill Reshke 2025-03-29 12:38:15 Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).