Re: "page is not marked all-visible" warning in regression tests

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: "page is not marked all-visible" warning in regression tests
Date: 2012-06-06 18:19:15
Message-ID: CA+Tgmoaa+pS_6kr80PaBnKLxbMprU7y_94VWV-EU01=NBxNz6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 6, 2012 at 1:46 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On a cursory lock it might just be a race condition in
> vacuumlazy.c:lazy_scan_heap. If scan_all is set, which it has to be for the
> warning to be visible, all_visible_according_to_vm is determined before we
> loop over all blocks. At the point where one specific heap block is actually
> read and locked that knowledge might be completely outdated by any concurrent
> backend. Am I missing something?

No, I think you're right. I think that warning is bogus. I added it
in place of some older warning which no longer made sense, but I think
this one doesn't make sense either.

> I have to say the whole visibilitymap correctness and crash-safety seems to be
> quite under documented, especially as it seems to be somewhat intricate (to
> me). E.g. not having any note why visibilitymap_test doesn't need locking. (I
> guess the theory is that a 1 byte read will always be consistent. But how does
> that ensure other backends see an up2date value?).

It's definitely intricate, and it's very possible that we should have
some more documentation. I am not sure exactly what and where, but
feel free to suggest something.

visibilitymap_test() does have a comment saying that:

/*
* We don't need to lock the page, as we're only looking at a
single bit.
*/

But that's a bit unsatisfying, because, as you say, it doesn't address
the question of memory-ordering issues. I think that there's no
situation in which it causes a problem to see the visibility map bit
as unset when in reality it has just recently been set by some other
back-end. It would be bad if someone did something like:

if (visibilitymap_test(...))
visibilitymap_clear();

...because then memory-ordering issues could cause us to accidentally
fail to clear the bit. No one should be doing that, though; the
relevant locking and conditional logic is built directly into
visibilitymap_clear().

On the flip side, if someone sees the visibility map bit as set when
it's actually just been cleared, that could cause a problem - most
seriously, index-only scans could return wrong answers. For that to
happen, someone would have to insert a heap tuple onto a previously
all-visible page, clearing the visibility map bit, and then insert an
index tuple; concurrently, some other backend would need to see the
index tuple but not the fact that the visibility map bit had been
cleared. I don't think that can happen: after inserting the heap
tuple, the inserting backend would release buffer content lock, which
acts as a full memory barrier; before reading the index tuple, the
index-only-scanning backend would have to take the content lock on the
index buffer, which also acts as a full memory barrier. So the
inserter can't do the writes out of order, and the index-only-scanner
can't do the reads out of order, so I think it's safe.... but we
probably do need to explain that somewhere.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2012-06-06 18:26:12 Re: pg_receivexlog and feedback message
Previous Message Fujii Masao 2012-06-06 18:10:18 Re: incorrect handling of the timeout in pg_receivexlog