From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> |
Subject: | Re: xid wraparound danger due to INDEX_CLEANUP false |
Date: | 2020-04-23 03:08:42 |
Message-ID: | CAH2-Wz=WRu6NMWtit2weDnuGxdsWeNyFygeBP_zZ2Sso0YAGFg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Apr 22, 2020 at 6:05 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> (It would be nice if we could teach Valgrind to "poison" buffers when
> we don't have a pin held...that would probably have caught this issue
> almost immediately.)
I can get Valgrind to complain about it when the regression tests are
run with the attached patch applied. I see this in the logs at several
points when "make installcheck" runs:
==1082059== VALGRINDERROR-BEGIN
==1082059== Invalid read of size 4
==1082059== at 0x21D8DE: btvacuumpage (nbtree.c:1370)
==1082059== by 0x21DA61: btvacuumscan (nbtree.c:1039)
==1082059== by 0x21DBD5: btbulkdelete (nbtree.c:879)
==1082059== by 0x215821: index_bulk_delete (indexam.c:698)
==1082059== by 0x20FDCE: lazy_vacuum_index (vacuumlazy.c:2427)
==1082059== by 0x2103EA: lazy_vacuum_all_indexes (vacuumlazy.c:1794)
==1082059== by 0x211EA1: lazy_scan_heap (vacuumlazy.c:1681)
==1082059== by 0x211EA1: heap_vacuum_rel (vacuumlazy.c:510)
==1082059== by 0x360414: table_relation_vacuum (tableam.h:1457)
==1082059== by 0x360414: vacuum_rel (vacuum.c:1880)
==1082059== by 0x361785: vacuum (vacuum.c:449)
==1082059== by 0x361F0E: ExecVacuum (vacuum.c:249)
==1082059== by 0x4D979C: standard_ProcessUtility (utility.c:823)
==1082059== by 0x4D9C7F: ProcessUtility (utility.c:522)
==1082059== by 0x4D6791: PortalRunUtility (pquery.c:1157)
==1082059== by 0x4D725F: PortalRunMulti (pquery.c:1303)
==1082059== by 0x4D7CEF: PortalRun (pquery.c:779)
==1082059== by 0x4D3BB7: exec_simple_query (postgres.c:1239)
==1082059== by 0x4D4ABD: PostgresMain (postgres.c:4315)
==1082059== by 0x45B0C9: BackendRun (postmaster.c:4510)
==1082059== by 0x45B0C9: BackendStartup (postmaster.c:4202)
==1082059== by 0x45B0C9: ServerLoop (postmaster.c:1727)
==1082059== by 0x45C754: PostmasterMain (postmaster.c:1400)
==1082059== by 0x3BDD68: main (main.c:210)
==1082059== Address 0x6cc7378 is in a rw- anonymous segment
==1082059==
==1082059== VALGRINDERROR-END
(The line numbers might be slightly different to master here, but the
line from btvacuumpage() is definitely the one that accesses the
special area of the B-Tree page after we drop the pin.)
This patch is very rough -- it was just the first thing that I tried.
I don't know how Valgrind remembers the status of shared memory
regions across backends when they're marked with
VALGRIND_MAKE_MEM_NOACCESS(). Even still, this idea looks promising. I
should try to come up with a committable patch before too long.
The good news is that the error I showed is the only error that I see,
at least with this rough patch + "make installcheck". It's possible
that the patch isn't as effective as it could be, though. For one
thing, it definitely won't detect incorrect buffer accesses where a
pin is held but a buffer lock is not held. That seems possible, but a
bit harder.
--
Peter Geoghegan
Attachment | Content-Type | Size |
---|---|---|
0002-Valgrind-instrumentation.patch | application/octet-stream | 3.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2020-04-23 03:13:50 | Re: PG compilation error with Visual Studio 2015/2017/2019 |
Previous Message | David Rowley | 2020-04-23 02:50:22 | Re: create partition table caused server crashed with self-referencing foreign key |