From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl> |
Subject: | Re: AIO v2.5 |
Date: | 2025-03-25 15:57:58 |
Message-ID: | rdml3fpukrqnas7qc5uimtl2fyytrnu6ymc2vjf2zuflbsjuul@hyizyjsexwmm |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2025-03-25 07:11:20 -0700, Noah Misch wrote:
> On Mon, Mar 24, 2025 at 10:52:19PM -0400, Andres Freund wrote:
> > Is it actually sane to use WARNING here? At least for ZERO_ON_ERROR that could
> > trigger a rather massive flood of messages to the client in a *normal*
> > situation. I'm thinking of something like an insert extending a relation some
> > time after an immediate restart and encountering a lot of FSM corruption (due
> > to its non-crash-safe-ness) during the search for free space and the
> > subsequent FSM vacuum. It might be ok to LOG that, but sending a lot of
> > WARNINGs to the client seems not quite right.
>
> Orthogonal to AIO, I do think LOG (or even DEBUG1?) is better for
> ZERO_ON_ERROR. The ZERO_ON_ERROR case also should not use
> ERRCODE_DATA_CORRUPTED. (That errcode shouldn't appear for business as usual.
> It should signify wrong or irretrievable query results, essentially.)
I strongly agree on the errcode - basically makes it much harder to use the
errcode to trigger alerting. And we don't have any other way to do that...
I'm, obviously, positive on not using WARNING for ZERO_ON_ERROR. I'm neutral
on LOG vs DEBUG1, I can see arguments for either.
> For zero_damaged_pages, WARNING seems at least defensible, and
> ERRCODE_DATA_CORRUPTED is right. It wouldn't be the worst thing to change
> zero_damaged_pages to LOG and let the complete_shared runner log it, as long
> as we release-note that. It's superuser-only, and the superuser can learn to
> check the log. One typically should use zero_damaged_pages in one session at
> a time, so the logs won't be too confusing.
It's obviously tempting to go for that, I'm somewhat undecided what the best
way is right now. There might be a compromise, see below:
> > If we want to implement it, I think we could introduce PGAIO_RS_WARN, which
> > then could tell the stager to issue the WARNING. It would add a bit of
> > distributed cost, both to callbacks and users of AIO, but it might not be too
> > bad.
FWIW, I prototyped this, it's not hard.
But it can't replace the current WARNING with 100% fidelity: If we read 60
blocks in a single smgrreadv, we today would would emit 60 WARNINGs. But we
can't encode that many block offset in single PgAioResult, there's not enough
space, and enlarging it far enough doesn't seem to make sense either.
What we *could* do is to emit one WARNING for each bufmgr.c smgrstartreadv(),
with that warning saying that there were N zeroed blocks in a read from block
N to block Y and a HINT saying that there are more details in the server log.
> Another thought on complete_shared running on other backends: I wonder if we
> should push an ErrorContextCallback that adds "CONTEXT: completing I/O of
> other process" or similar, so people wonder less about how "SELECT FROM a" led
> to a log message about IO on table "b".
I've been wondering about that as well, and yes, we probably should.
I'd add the pid of the backend that started the IO to the message - although
need to check whether we're trying to keep PIDs of other processes from
unprivileged users.
I think we probably should add a similar, but not equivalent, context in io
workers. Maybe "I/O worker executing I/O on behalf of process %d".
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Noah Misch | 2025-03-25 15:58:08 | Re: AIO v2.5 |
Previous Message | Christoph Berg | 2025-03-25 15:41:06 | Re: query_id: jumble names of temp tables for better pg_stat_statement UX |