From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | "Wood, Dan" <hexpert(at)amazon(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-committers(at)postgresql(dot)org, "Wong, Yi Wen" <yiwong(at)amazon(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple |
Date: | 2017-12-06 20:23:55 |
Message-ID: | 20171206202355.byfh6g3bezy4rrcr@alvherre.pgsql |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-committers pgsql-hackers |
Andres Freund wrote:
> I've played around quite some with the attached patch. So far, after
> applying the second patch, neither VACUUM nor VACUUM FULL / CLUSTER make
> the situation worse for already existing corruption. HOT pruning can
> change the exact appearance of existing corruption a bit, but I don't
> think it can make the corruption meaningfully worse. It's a bit
> annoying and scary to add so many checks to backbranches but it kinda
> seems required. The error message texts aren't perfect, but these are
> "should never be hit" type elog()s so I'm not too worried about that.
Looking at 0002: I agree with the stuff being done here. I think a
couple of these checks could be moved one block outerwards in term of
scope; I don't see any reason why the check should not apply in that
case. I didn't catch any place missing additional checks.
Despite these being "shouldn't happen" conditions, I think we should
turn these up all the way to ereports with an errcode and all, and also
report the XIDs being complained about. No translation required,
though. Other than those changes and minor copy editing a commit
(attached), 0002 looks good to me.
I started thinking it'd be good to report block number whenever anything
happened while scanning the relation. The best way to go about this
seems to be to add an errcontext callback to lazy_scan_heap, so I attach
a WIP untested patch to add that. (I'm not proposing this for
back-patch for now, mostly because I don't have the time/energy to push
for it right now.)
It appears that you got all the places that seem to reasonably need
additional checks.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment | Content-Type | Size |
---|---|---|
0001-tweaks-for-0002.patch | text/plain | 6.5 KB |
0002-blkno-rel-context-info-for-lazy_scan_heap.patch | text/plain | 3.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2017-12-06 20:28:12 | Re: pgsql: When VACUUM or ANALYZE skips a concurrently dropped table, log i |
Previous Message | Bossart, Nathan | 2017-12-06 19:23:57 | Re: pgsql: When VACUUM or ANALYZE skips a concurrently dropped table, log i |
From | Date | Subject | |
---|---|---|---|
Next Message | Walter Cai | 2017-12-06 20:26:26 | Accessing base table relational names via RelOptInfo |
Previous Message | Bossart, Nathan | 2017-12-06 19:23:57 | Re: pgsql: When VACUUM or ANALYZE skips a concurrently dropped table, log i |