From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Himanshu Upadhyaya <upadhyaya(dot)himanshu(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> |
Subject: | Re: HOT chain validation in verify_heapam() |
Date: | 2023-03-23 20:36:56 |
Message-ID: | 20230323203656.le7thulot4zrzi6v@awork3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2023-03-23 15:37:15 -0400, Robert Haas wrote:
> On Wed, Mar 22, 2023 at 8:38 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > skink / valgrind reported in a while back and found another issue:
> >
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2023-03-22%2021%3A53%3A41
> >
> > ==2490364== VALGRINDERROR-BEGIN
> > ==2490364== Conditional jump or move depends on uninitialised value(s)
> > ==2490364== at 0x11D459F2: check_tuple_visibility (verify_heapam.c:1379)
> ...
> > ==2490364== Uninitialised value was created by a stack allocation
> > ==2490364== at 0x11D45325: check_tuple_visibility (verify_heapam.c:994)
>
> OK, so this is an interesting one. It's complaining about switch
> (xmax_status), because the get_xid_status(xmax, ctx, &xmax_status)
> used in the previous switch might not actually initialize xmax_status,
> and apparently didn't in this case. get_xid_status() does not set
> xmax_status except when it returns XID_BOUNDS_OK, and the previous
> switch falls through both in that case and also when get_xid_status()
> returns XID_INVALID. That seems like it must be the issue here. As far
> as I can see, this isn't related to any of the recent changes but has
> been like this since this code was introduced, so I'm a little
> confused about why it's only causing a problem now.
Could it be that the tests didn't exercise the path before?
> Nonetheless, here's a patch. I notice that there's a similar problem
> in another place, too. get_xid_status() is called a total of five
> times and it looks like only three of them got it right. I suppose
> that if this is correct we should back-patch it.
Yea, I think you're right.
> + report_corruption(ctx,
> + pstrdup("xmin is invalid"));
Not a correctness issue: Nearly all callers of report_corruption() do a
psprintf(), the rest a pstrdup(), as here. Seems like it'd be cleaner to
just make report_corruption() accept a format string?
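A rough sketch of what a format-string variant could look like, using plain C varargs. Everything here is an assumption made for illustration: DemoCheckContext, demo_report_corruption(), and the fixed-size message buffer are hypothetical and do not reflect the actual signature or storage used in verify_heapam.c.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical context; the real one carries far more state. */
typedef struct DemoCheckContext
{
    char        last_msg[256];  /* most recent report, truncated if long */
    int         nreports;       /* number of reports made so far */
} DemoCheckContext;

/* Let GCC/Clang type-check the format arguments; a no-op elsewhere. */
#if defined(__GNUC__)
#define DEMO_PRINTF_ATTR(f, a) __attribute__((format(printf, f, a)))
#else
#define DEMO_PRINTF_ATTR(f, a)
#endif

static void demo_report_corruption(DemoCheckContext *ctx,
                                   const char *fmt, ...)
            DEMO_PRINTF_ATTR(2, 3);

/*
 * Formats the message itself, so callers need neither psprintf() nor
 * pstrdup() at the call site.
 */
static void
demo_report_corruption(DemoCheckContext *ctx, const char *fmt, ...)
{
    va_list     args;

    assert(ctx != NULL && fmt != NULL);

    va_start(args, fmt);
    vsnprintf(ctx->last_msg, sizeof(ctx->last_msg), fmt, args);
    va_end(args);

    ctx->nreports++;
}
```

With this shape, a constant message is written as demo_report_corruption(&ctx, "xmin is invalid") and a parameterized one as demo_report_corruption(&ctx, "xmin %u precedes %u", a, b), so both kinds of call site look the same.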
Greetings,
Andres Freund